Mathematical Logic as based on the Theory of Types.

By Bertrand Russell. 



The following theory of symbolic logic recommended itself to me in the first 
instance by its ability to solve certain contradictions, of which the one best 
known to mathematicians is Burali-Forti's concerning the greatest ordinal.* But 
the theory in question seems not wholly dependent on this indirect recom- 
mendation ; it has also, if I am not mistaken, a certain consonance with common 
sense which makes it inherently credible. This, however, is not a merit upon 
which much stress should be laid; for common sense is far more fallible than it 
likes to believe. I shall therefore begin by stating some of the contradictions to 
be solved, and shall then show how the theory of logical types effects their 
solution. 

I. 

The Contradictions. 

(1) The oldest contradiction of the kind in question is the Epimenides. 
Epimenides the Cretan said that all Cretans were liars, and all other statements 
made by Cretans were certainly lies. Was this a lie ? The simplest form of this 
contradiction is afforded by the man who says " I am lying;" if he is lying, he 
is speaking the truth, and vice versa. 

(2) Let w be the class of all those classes which are not members of themselves.
Then, whatever class x may be, "x is a w" is equivalent† to "x is not
an x." Hence, giving to x the value w, "w is a w" is equivalent to "w is not
a w."

(3) Let T be the relation which subsists between two relations R and S
whenever R does not have the relation R to S. Then, whatever relations R and
S may be, "R has the relation T to S" is equivalent to "R does not have the
relation R to S." Hence, giving the value T to both R and S, "T has the relation
T to T" is equivalent to "T does not have the relation T to T."

* See below.

† Two propositions are called equivalent when both are true or both are false.

(4) The number of syllables in the English names of finite integers tends 
to increase as the integers grow larger, and must gradually increase indefinitely, 
since only a finite number of names can be made with a given finite number of 
syllables. Hence the names of some integers must consist of at least nineteen 
syllables, and among these there must be a least. Hence " the least integer not 
nameable in fewer than nineteen syllables" must denote a definite integer; in 
fact, it denotes 111,777. But "the least integer not nameable in fewer than 
nineteen syllables " is itself a name consisting of eighteen syllables ; hence the 
least integer not nameable in fewer than nineteen syllables can be named in 
eighteen syllables, which is a contradiction.* 

(5) Among transfinite ordinals some can be defined, while others can not;
for the total number of possible definitions is ℵ₀, while the number of transfinite
ordinals exceeds ℵ₀. Hence there must be indefinable ordinals, and
among these there must be a least. But this is defined as "the least indefinable
ordinal," which is a contradiction.†

(6) Richard's paradox‡ is akin to that of the least indefinable ordinal. It
is as follows: Consider all decimals that can be defined by means of a finite
number of words; let E be the class of such decimals. Then E has ℵ₀ terms;
hence its members can be ordered as the 1st, 2nd, 3rd, .... Let N be a number
defined as follows: If the nth figure in the nth decimal is p, let the nth figure
in N be p + 1 (or 0, if p = 9). Then N is different from all the members of E,
since, whatever finite value n may have, the nth figure in N is different from the
nth figure in the nth of the decimals composing E, and therefore N is different
from the nth decimal. Nevertheless we have defined N in a finite number of
words, and therefore N ought to be a member of E. Thus N both is and is not
a member of E.

(7) Burali-Forti's contradiction§ may be stated as follows: It can be shown
that every well-ordered series has an ordinal number, that the series of ordinals
up to and including any given ordinal exceeds the given ordinal by one, and (on
certain very natural assumptions) that the series of all ordinals (in order of
magnitude) is well-ordered. It follows that the series of all ordinals has an
ordinal number, Ω say. But in that case the series of all ordinals including Ω
has the ordinal number Ω + 1, which must be greater than Ω. Hence Ω is not
the ordinal number of all ordinals.

* This contradiction was suggested to me by Mr. G. G. Berry of the Bodleian Library.

† Cf. König, "Ueber die Grundlagen der Mengenlehre und das Kontinuumproblem," Math. Annalen, Vol.
LXI (1905); A. C. Dixon, "On 'well-ordered' aggregates," Proc. London Math. Soc., Series 2, Vol. IV, Part I
(1906); and E. W. Hobson, "On the Arithmetic Continuum," ibid. The solution offered in the last of these
papers does not seem to me adequate.

‡ Cf. Poincaré, "Les mathématiques et la logique," Revue de Métaphysique et de Morale, Mai, 1906, especially
sections VII and IX; also Peano, Revista de Mathematica, Vol. VIII, No. 5 (1906), p. 149 ff.

§ "Una questione sui numeri transfiniti," Rendiconti del circolo matematico di Palermo, Vol. XI (1897).
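The diagonal rule used in (6) to build N can be sketched in a few lines. This is only an illustration on a finite list: the function name and the sample digit strings are assumptions introduced here, and a finite list merely stands in for the supposed enumeration of E.

```python
def diagonal_digits(decimals):
    """Return the first len(decimals) figures of N, built by the rule in (6).

    `decimals` is a finite list of digit strings (the figures after the
    decimal point); the n-th string must contain at least n figures.
    """
    figures = []
    for n, expansion in enumerate(decimals):
        p = int(expansion[n])                 # n-th figure of the n-th decimal
        figures.append(0 if p == 9 else p + 1)
    return "".join(str(d) for d in figures)


# Hypothetical stand-in for the first four members of E.
sample = ["1415", "7182", "3399", "0000"]
n_figures = diagonal_digits(sample)           # "2201"
```

Whatever list is supplied, the nth figure of the result differs from the nth figure of the nth entry, which is all the argument in (6) requires.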

In all the above contradictions (which are merely selections from an 
indefinite number) there is a common characteristic, which we may describe as 
self-reference or reflexiveness. The remark of Epimenides must include itself 
in its own scope. If all classes, provided they are not members of themselves,
are members of w, this must also apply to w ; and similarly for the analogous 
relational contradiction. In the cases of names and definitions, the paradoxes 
result from considering non-nameability and indefinability as elements in names 
and definitions. In the case of Burali-Forti's paradox, the series whose ordinal 
number causes the difficulty is the series of all ordinal numbers. In each con- 
tradiction something is said about all cases of some kind, and from what is said 
a new case seems to be generated, which both is and is not of the same kind as 
the cases of which all were concerned in what was said. Let us go through the 
contradictions one by one and see how this occurs. 

(1) When a man says " I am lying," we may interpret his statement as : 
"There is a proposition which I am affirming and which is false." All state- 
ments that " there is " so-and-so may be regarded as denying that the opposite 
is always true; thus "I am lying" becomes: "It is not true of all propositions 
that either I am not affirming them or they are true;" in other words, "It is 
not true for all propositions p that if I affirm p, p is true." The paradox 
results from regarding this statement as affirming a proposition, which must 
therefore come within the scope of the statement. This, however, makes it 
evident that the notion of " all propositions" is illegitimate ; for otherwise, there 
must be propositions (such as the above) which are about all propositions, and 
yet can not, without contradiction, be included among the propositions they are 
about. Whatever we suppose to be the totality of propositions, statements about 
this totality generate new propositions which, on pain of contradiction, must lie 
outside the totality. It is useless to enlarge the totality, for that equally 
enlarges the scope of statements about the totality. Hence there must be no 
totality of propositions, and "all propositions" must be a meaningless phrase.

(2) In this case, the class w is defined by reference to "all classes," and then
turns out to be one among classes. If we seek help by deciding that no class is 
a member of itself, then w becomes the class of all classes, and we have to decide 
that this is not a member of itself, i. e., is not a class. This is only possible if 
there is no such thing as the class of all classes in the sense required by the 
paradox. That there is no such class results from the fact that, if we suppose 
there is, the supposition immediately gives rise (as in the above contradiction) 
to new classes lying outside the supposed total of all classes. 

(3) This case is exactly analogous to (2), and shows that we can not 
legitimately speak of " all relations." 

(4) " The least integer not nameable in fewer than nineteen syllables " 
involves the totality of names, for it is "the least integer such that all names 
either do not apply to it or have more than nineteen syllables." Here we 
assume, in obtaining the contradiction, that a phrase containing " all names " is 
itself a name, though it appears from the contradiction that it can not be one of 
the names which were supposed to be all the names there are. Hence " all 
names " is an illegitimate notion. 

(5) This case, similarly, shows that " all definitions " is an illegitimate 
notion. 

(6) This is solved, like (5), by remarking that "all definitions" is an 
illegitimate notion. Thus the number E is not defined in a finite number of 
words, being in fact not defined at all.* 

(7) Burali-Forti's contradiction shows that "all ordinals" is an illegitimate 
notion ; for if not, all ordinals in order of magnitude form a well-ordered series, 
which must have an ordinal number greater than all ordinals. 

Thus all our contradictions have in common the assumption of a totality 
such that, if it were legitimate, it would at once be enlarged by new members 
defined in terms of itself. 

This leads us to the rule : " Whatever involves all of a collection must not 
be one of the collection ; " or, conversely : " If, provided a certain collection had 
a total, it would have members only definable in terms of that total, then the 
said collection has no total."†

* Cf. "Les paradoxes de la logique," by the present author, Revue de Métaphysique et de Morale, Sept., 1906,
p. 645.

† When I say that a collection has no total, I mean that statements about all its members are nonsense.
Furthermore, it will be found that the use of this principle requires the distinction of all and any considered in
Section II.

The above principle is, however, purely negative in its scope. It suffices to
show that many theories are wrong, but it does not show how the errors are to 
be rectified. We can not say: "When I speak of all propositions, I mean all 
except those in which 'all propositions' are mentioned;" for in this explanation 
we have mentioned the propositions in which all propositions are mentioned, 
which we can not do significantly. It is impossible to avoid mentioning a thing 
by mentioning that we won't mention it. One might as well, in talking to a 
man with a long nose, say : " When I speak of noses, I except such as are inor- 
dinately long," which would not be a very successful effort to avoid a painful 
topic. Thus it is necessary, if we are not to sin against the above negative 
principle, to construct our logic without mentioning such things as "all propositions"
or "all properties," and without even having to say that we are
excluding such things. The exclusion must result naturally and inevitably from
our positive doctrines, which must make it plain that "all propositions" and 
"all properties" are meaningless phrases. 

The first difficulty that confronts us is as to the fundamental principles of 
logic known under the quaint name of " laws of thought." "All propositions 
are either true or false," for example, has become meaningless. If it were 
significant, it would be a proposition, and would come under its own scope. 
Nevertheless, some substitute must be found, or all general accounts of deduction 
become impossible. 

Another more special difficulty is illustrated by the particular case of 
mathematical induction. We want to be able to say: "If n is a finite integer,
n has all properties possessed by 0 and by the successors of all numbers possessing
them." But here "all properties" must be replaced by some other phrase
not open to the same objections. It might be thought that "all properties possessed
by 0 and by the successors of all numbers possessing them" might be
legitimate even if "all properties" were not. But in fact this is not so. We
shall find that phrases of the form "all properties which etc." involve all properties
of which the "etc." can be significantly either affirmed or denied, and not
only those which in fact have whatever characteristic is in question ; for, in the 
absence of a catalogue of properties having this characteristic, a statement 
about all those that have the characteristic must be hypothetical, and of the 
form : " It is always true that, if a property has the said characteristic, then 
etc." Thus mathematical induction is prima facie incapable of being significantly 
enunciated, if "all properties" is a phrase destitute of meaning. This difficulty,
as we shall see later, can be avoided ; for the present we must consider the laws 
of logic, since these are far more fundamental. 

II. 

All and Any. 

Given a statement containing a variable x, say " x = x," we may affirm that 
this holds in all instances, or we may affirm any one of the instances without 
deciding as to which instance we are affirming. The distinction is roughly the 
same as that between the general and particular enunciation in Euclid. The 
general enunciation tells us something about (say) all triangles, while the par- 
ticular enunciation takes one triangle, and asserts the same thing of this one 
triangle. But the triangle taken is any triangle, not some one special triangle ; 
and thus although, throughout the proof, only one triangle is dealt with, yet the 
proof retains its generality. If we say : " Let ABC be a triangle, then the sides 
AB, AC are together greater than the side BC," we are saying something about
one triangle, not about all triangles ; but the one triangle concerned is absolutely 
ambiguous, and our statement consequently is also absolutely ambiguous. We 
do not affirm any one definite proposition, but an undetermined one of all the 
propositions resulting from supposing ABC to be this or that triangle. This 
notion of ambiguous assertion is very important, and it is vital not to confound 
an ambiguous assertion with the definite assertion that the same thing holds in 
all cases. 

The distinction between (1) asserting any value of a propositional function, 
and (2) asserting that the function is always true, is present throughout mathe- 
matics, as it is in Euclid's distinction of general and particular enunciations. In 
any chain of mathematical reasoning, the objects whose properties are being 
investigated are the arguments to any value of some propositional function. 
Take as an illustration the following definition:

"We call f(x) continuous for x = a if, for every positive number σ, different
from 0, there exists a positive number ε, different from 0, such that, for all
values of δ which are numerically less than ε, the difference f(a + δ) − f(a) is
numerically less than σ."

Here the function f is any function for which the above statement has a
meaning; the statement is about f, and varies as f varies. But the statement
is not about σ or ε or δ, because all possible values of these are concerned, not
one undetermined value. (In regard to ε, the statement "there exists a positive
number ε such that etc." is the denial that the denial of "etc." is true of all
positive numbers.) For this reason, when any value of a propositional function
is asserted, the argument (e. g., f in the above) is called a real variable; whereas,
when a function is said to be always true, or to be not always true, the argument
is called an apparent variable.* Thus in the above definition, f is a real
variable, and σ, ε, δ are apparent variables.
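The quantifier structure of the quoted definition may be displayed in modern notation. The notation is an assumption of this sketch, not the author's: σ is the bound on the difference, ε the bound on the increment, and δ the increment. The three quantified letters are the apparent variables, while f (and the point a) stay free.

```latex
\[
  \forall \sigma > 0 \;\; \exists \varepsilon > 0 \;\; \forall \delta \;
  \bigl(\, |\delta| < \varepsilon \;\rightarrow\; |f(a+\delta) - f(a)| < \sigma \,\bigr)
\]
```

The existential quantifier can be rewritten as the denial of a universal one, which is the reading the parenthesis about ε gives.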

When we assert any value of a propositional function, we shall say simply 
that we assert the propositional function. Thus if we enunciate the law of 
identity in the form " x = x" we are asserting the function " x = x ;" i. e., we 
are asserting any value of this function. Similarly we may be said to deny a 
propositional function when we deny any instance of it. We can only truly 
assert a propositional function if, whatever value we choose, that value is true;
similarly we can only truly deny it if, whatever value we choose, that value is
false. Hence in the general case, in which some values are true and some false,
we can neither assert nor deny a propositional function.†

If φx is a propositional function, we will denote by "(x) . φx" the proposition
"φx is always true." Similarly "(x, y) . φ(x, y)" will mean "φ(x, y) is
always true," and so on. Then the distinction between the assertion of all
values and the assertion of any is the distinction between (1) asserting (x) . φx
and (2) asserting φx where x is undetermined. The latter differs from the
former in that it can not be treated as one determinate proposition.

The distinction between asserting φx and asserting (x) . φx was, I believe,
first emphasized by Frege.‡ His reason for introducing the distinction explicitly
was the same which had caused it to be present in the practice of mathematicians;
namely, that deduction can only be effected with real variables, not with apparent
variables. In the case of Euclid's proofs, this is evident: we need (say) some
one triangle ABC to reason about, though it does not matter what triangle it is.
The triangle ABC is a real variable; and although it is any triangle, it remains
the same triangle throughout the argument. But in the general enunciation,
the triangle is an apparent variable. If we adhere to the apparent variable, we
can not perform any deductions, and this is why in all proofs, real variables
have to be used. Suppose, to take the simplest case, that we know "φx is
always true," i. e. "(x) . φx," and we know "φx always implies ψx," i. e. "(x) . {φx
implies ψx}." How shall we infer "ψx is always true," i. e. "(x) . ψx?" We
know it is always true that if φx is true, and if φx implies ψx, then ψx is true.
But we have no premises to the effect that φx is true and φx implies ψx; what
we have is: φx is always true, and φx always implies ψx. In order to make our
inference, we must go from "φx is always true" to φx, and from "φx always
implies ψx" to "φx implies ψx," where the x, while remaining any possible
argument, is to be the same in both. Then, from "φx" and "φx implies ψx,"
we infer "ψx;" thus ψx is true for any possible argument, and therefore is
always true. Thus in order to infer "(x) . ψx" from "(x) . φx" and "(x) . {φx
implies ψx}," we have to pass from the apparent to the real variable, and then
back again to the apparent variable. This process is required in all mathematical
reasoning which proceeds from the assertion of all values of one or more propositional
functions to the assertion of all values of some other propositional
function, as, e. g., from "all isosceles triangles have equal angles at the base" to
"all triangles having equal angles at the base are isosceles." In particular, this
process is required in proving Barbara and the other moods of the syllogism.
In a word, all deduction operates with real variables (or with constants).

* These two terms are due to Peano, who uses them approximately in the above sense. Cf., e. g., Formulaire
Mathématique, Vol. IV, p. 5 (Turin, 1903).

† Mr. MacColl speaks of "propositions" as divided into the three classes of certain, variable, and impossible.
We may accept this division as applying to propositional functions. A function which can be
asserted is certain, one which can be denied is impossible, and all others are (in Mr. MacColl's sense) variable.

‡ See his Grundgesetze der Arithmetik, Vol. I (Jena, 1893), §17, p. 31.
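The passage from the apparent variable to the real variable and back can be checked mechanically. The following is a minimal sketch in Lean 4; the type α and the hypothesis names h₁ and h₂ are assumptions of the illustration, and Lean's typed setting is of course not the system of this paper.

```lean
-- φ and ψ are arbitrary propositional functions over an arbitrary type α.
example {α : Type} (φ ψ : α → Prop)
    (h₁ : ∀ x, φ x)            -- "φx is always true"
    (h₂ : ∀ x, φ x → ψ x) :    -- "φx always implies ψx"
    ∀ x, ψ x :=                -- "ψx is always true"
  fun x =>                     -- take any x and hold it fixed: a real variable
    h₂ x (h₁ x)                -- from "φx" and "φx implies ψx" infer "ψx"
```

The two universal premises are each instantiated at the same x, modus ponens is applied to the instances, and the result is then generalized again.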

It might be supposed that we could dispense with apparent variables 
altogether, contenting ourselves with any as a substitute for all. This, however, 
is not the case. Take, for example, the definition of a continuous function quoted 
above: in this definition σ, ε, and δ must be apparent variables. Apparent
variables are constantly required for definitions. Take, e. g., the following : 
"An integer is called a prime when it has no integral factors except 1 and itself." 
This definition unavoidably involves an apparent variable in the form : " If n is 
an integer other than 1 or the given integer, n is not a factor of the given integer, 
for all possible values of n."
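The role of the apparent variable in the definition of a prime can be seen in a short sketch. The function name is hypothetical, and two assumptions are added that the definition above leaves open: only positive candidate factors are considered, and the given integer is taken to be at least 2.

```python
def is_prime(k):
    """True when no integer other than 1 and k itself is a factor of k."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    # n is the apparent variable: the test speaks of every candidate factor,
    # not of one undetermined candidate.
    return all(k % n != 0 for n in range(2, k))


primes_below_20 = [k for k in range(2, 20) if is_prime(k)]
```

The caller supplies k, which plays the part of the real variable; n never appears outside the `all(...)` expression.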

The distinction between all and any is, therefore, necessary to deductive 
reasoning, and occurs throughout mathematics ; though, so far as I know, its 
importance remained unnoticed until Frege pointed it out. 

For our purposes it has a different utility, which is very great. In the case 
of such variables as propositions or properties, "any value" is legitimate, though 
"all values" is not. Thus we may say: "p is true or false, where p is any 
proposition," though we can not say "all propositions are true or false." The
reason is that, in the former, we merely affirm an undetermined one of the 
propositions of the form " p is true or false," whereas in the latter we affirm (if 
anything) a new proposition, different from all the propositions of the form 
"p is true or false." Thus we may admit "any value" of a variable in cases 
where " all values " would lead to reflexive fallacies ; for the admission of " any 
value" does not in the same way create new values. Hence the fundamental 
laws of logic can be stated concerning any proposition, though we can not 
significantly say that they hold of all propositions. These laws have, so to 
speak, a particular enunciation but no general enunciation. There is no one 
proposition which is the law of contradiction (say) ; there are only the various 
instances of the law. Of any proposition p, we can say: "p and not-p can
not both be true;" but there is no such proposition as: "Every proposition p
is such that p and not-p can not both be true."

A similar explanation applies to properties. We can speak of any property
of x, but not of all properties, because new properties would be thereby
generated. Thus we can say: "If n is a finite integer, and if 0 has the property
φ, and m + 1 has the property φ provided m has it, it follows that n has
the property φ." Here we need not specify φ; φ stands for "any property."
But we can not say: "A finite integer is defined as one which has every property
φ possessed by 0 and by the successors of possessors." For here it is essential
to consider every property,* not any property; and in using such a definition we
assume that it embodies a property distinctive of finite integers, which is just
the kind of assumption from which, as we saw, the reflexive contradictions
spring.

* This is indistinguishable from "all properties."

In the above instance, it is necessary to avoid the suggestions of ordinary
language, which is not suitable for expressing the distinction required. The
point may be illustrated further as follows: If induction is to be used for defining
finite integers, induction must state a definite property of finite integers, not an
ambiguous property. But if φ is a real variable, the statement "n has the
property φ provided this property is possessed by 0 and by the successors of
possessors" assigns to n a property which varies as φ varies, and such a property
can not be used to define the class of finite integers. We wish to say: "'n is a
finite integer' means: 'Whatever property φ may be, n has the property φ provided
φ is possessed by 0 and by the successors of possessors.'" But here φ has
become an apparent variable. To keep it a real variable, we should have to say:
"Whatever property φ may be, 'n is a finite integer' means: 'n has the property
φ provided φ is possessed by 0 and by the successors of possessors.'" But here
the meaning of 'n is a finite integer' varies as φ varies, and thus such a definition
is impossible. This case illustrates an important point, namely the following:
"The scope* of a real variable can never be less than the whole propositional
function in the assertion of which the said variable occurs." That is, if our
propositional function is (say) "φx implies p," the assertion of this function will
mean "any value of 'φx implies p' is true," not "'any value of φx is true' implies
p." In the latter, we have really "all values of φx are true," and the x is
an apparent variable.

III. 

The Meaning and Range of Generalized Propositions. 

In this section we have to consider first the meaning of propositions in 
which the word all occurs, and then the kind of collections which admit of 
propositions about all their members. 

It is convenient to give the name generalized propositions not only to such as 
contain all, but also to such as contain some (undefined). The proposition "φx
is sometimes true" is equivalent to the denial of "not-φx is always true;"
"some A is B" is equivalent to the denial of "all A is not B;" i. e., of "no A
is B." Whether it is possible to find interpretations which distinguish "φx is
sometimes true" from the denial of "not-φx is always true," it is unnecessary
to inquire; for our purposes we may define "φx is sometimes true" as the
denial of "not-φx is always true." In any case, the two kinds of propositions
require the same kind of interpretation, and are subject to the same limitations. 
In each there is an apparent variable ; and it is the presence of an apparent 
variable which constitutes what I mean by a generalized proposition. (Note 
that there can not be a real variable in any proposition; for what contains a real 
variable is a propositional function, not a proposition.) 
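The definition of "sometimes" as the denial of "not ... always" can be tried on a finite collection of arguments. This is a sketch only: the helper names and the sample domain are assumptions introduced here, and a finite list stands in for the range of the variable.

```python
def always(phi, domain):
    """'phi x is always true', for x ranging over a finite domain."""
    return all(phi(x) for x in domain)


def sometimes(phi, domain):
    """Defined as the denial of 'not-phi x is always true'."""
    return not always(lambda x: not phi(x), domain)


def is_even(x):
    return x % 2 == 0


domain = list(range(10))
some_even = sometimes(is_even, domain)               # True
some_large = sometimes(lambda x: x > 100, domain)    # False
```

On any finite domain the defined `sometimes` agrees with a direct search for a satisfying instance, so nothing is lost by taking the definition as primitive.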

The first question we have to ask in this section is: How are we to interpret
the word all in such propositions as "all men are mortal?" At first sight, it
might be thought that there could be no difficulty, that "all men" is a perfectly
clear idea, and that we say of all men that they are mortal. But to this view
there are many objections.

* The scope of a real variable is the whole function of which "any value" is in question. Thus in "φx
implies p" the scope of x is not φx, but "φx implies p."

(1) If this view were right, it would seem that " all men are mortal" could 
not be true if there were no men. Yet, as Mr. Bradley has urged,* "Trespassers 
will be prosecuted " may be perfectly true even if no one trespasses ; and hence, 
as he further argues, we are driven to interpret such propositions as hypotheticals, 
meaning "if anyone trespasses, he will be prosecuted;" i. e., "if x trespasses, 
x will be prosecuted," where the range of values which x may have, whatever it 
is, is certainly not confined to those who really trespass. Similarly "all men 
are mortal" will mean "if x is a man, x is mortal, where x may have any value
within a certain range." What this range is, remains to be determined ; but in 
any case it is wider than " men," for the above hypothetical is certainly often 
true when x is not a man. 

(2) "All men" is a denoting phrase; and it would appear, for reasons which 
I have set forth elsewhere,† that denoting phrases never have any meaning in
isolation, but only enter as constituents into the verbal expression of proposi- 
tions which contain no constituent corresponding to the denoting phrases in 
question. That is to say, a denoting phrase is defined by means of the propo- 
sitions in whose verbal expression it occurs. Hence it is impossible that these 
propositions should acquire their meaning through the denoting phrases ; we 
must find an independent interpretation of the propositions containing such 
phrases, and must not use these phrases in explaining what such propositions 
mean. Hence we can not regard "all men are mortal" as a statement about 
"all men." 

(3) Even if there were such an object as " all men," it is plain that it is 
not this object to which we attribute mortality when we say "all men are 
mortal." If we were attributing mortality to this object, we should have to say 
" all men is mortal." Thus the supposition that there is such an object as " all 
men" will not help us to interpret "all men are mortal."

(4) It seems obvious that, if we meet something which may be a man or may 
be an angel in disguise, it comes within the scope of " all men are mortal " to 
assert " if this is a man, it is mortal." Thus again, as in the case of the tres- 
passers, it seems plain that we are really saying " if anything is a man, it is 
mortal," and that the question whether this or that is a man does not fall within 
the scope of our assertion, as it would do if the all really referred to " all men." 

* Logic, Part I, Chapter II.

† "On Denoting," Mind, October, 1905.

(5) We thus arrive at the view that what is meant by "all men are mortal"
may be more explicitly stated in some such form as " it is always true that if x 
is a man, x is mortal." Here we have to inquire as to the scope of the word 
always. 

(6) It is obvious that always includes some cases in which x is not a man, 
as we saw in the case of the disguised angel. If x were limited to the case
when x is a man, we could infer that x is a mortal, since if x is a man, x is a
mortal. Hence, with the same meaning of always, we should find "it is always
true that x is mortal." But it is plain that, without altering the meaning of 
always, this new proposition is false, though the other was true. 

(7) One might hope that "always" would mean "for all values of x."
But "all values of x," if legitimate, would include as parts "all propositions" 
and " all functions," and such illegitimate totalities. Hence the values of x 
must be somehow restricted within some legitimate totality. This seems to lead 
us to the traditional doctrine of a " universe of discourse " within which x must 
be supposed to lie. 

(8) Yet it is quite essential that we should have some meaning of always 
which does not have to be expressed in a restrictive hypothesis as to x. For 
suppose "always" means "whenever x belongs to the class i." Then "all men 
are mortal" becomes "whenever x belongs to the class i, if x is a man, x is
mortal;" i. e., "it is always true that if x belongs to the class i, then, if x is a
man, x is mortal." But what is our new always to mean ? There seems no more 
reason for restricting x, in this new proposition, to the class i, than there was 
before for restricting it to the class man. Thus we shall be led on to a new wider 
universe, and so on ad infinitum, unless we can discover some natural restriction
upon the possible values of (i. e., some restriction given with) the function "if x
is a man, x is mortal," and not needing to be imposed from without. 

(9) It seems obvious that, since all men are mortal, there can not be any 
false proposition which is a value of the function "if x is a man, x is mortal."
For if this is a proposition at all, the hypothesis "x is a man" must be a proposition,
and so must the conclusion "x is mortal." But if the hypothesis is false,
the hypothetical is true ; and if the hypothesis is true, the hypothetical is true. 
Hence there can be no false propositions of the form "if x is a man, x is
mortal." 

(10) It follows that, if any values of x are to be excluded, they can only be 
values for which there is no proposition of the form "if x is a man, x is mortal;" 
i. e., for which this phrase is meaningless. Since, as we saw in (7), there must be 
excluded values of x, it follows that the function "if x is a man, x is mortal" 
must have a certain range of significance* which falls short of all imaginable 
values of x, though it exceeds the values which are men. The restriction on x 
is therefore a restriction to the range of significance of the function "if x is a 
man, x is mortal." 

(11) We thus reach the conclusion that "all men are mortal" means "if x 
is a man, x is mortal, always," where always means "for all values of the 
function ' if x is a man, x is mortal.' " This is an internal limitation upon x, 
given by the nature of the function ; and it is a limitation which does not require 
explicit statement, since it is impossible for a function to be true more generally 
than for all its values. Moreover, if the range of significance of the function 
is i, the function "if x is an i, then if x is a man, x is mortal" has the same 
range of significance, since it can not be significant unless its constituent "if x 
is a man, x is mortal" is significant. But here the range of significance is again 
implicit, as it was in "if x is a man, x is mortal;" thus we can not make ranges 
of significance explicit, since the attempt to do so only gives rise to a new 
proposition in which the same range of significance is implicit. 

Thus generally: "(x) . φx" is to mean "φx always." This may be interpreted, 
though with less exactitude, as "φx is always true," or, more explicitly: 
"All propositions of the form φx are true," or "All values of the function φx 
are true."† Thus the fundamental all is "all values of a propositional function," 
and every other all is derivative from this. And every propositional function 
has a certain range of significance, within which lie the arguments for which the 
function has values. Within this range of arguments, the function is true or 
false; outside this range, it is nonsense. 
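
The difference between a function being false for an argument and being nonsense for it can be sketched in code. What follows is only an illustration under stated assumptions: the names `PropositionalFunction`, `significant`, `rule` and `always`, and the toy universe, are hypothetical and do not occur in the text. A function is modelled as a rule together with a test for its range of significance, and `always` looks only at the values the function actually has.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class PropositionalFunction:
    """A rule plus a test for its range of significance (both hypothetical)."""
    significant: Callable[[object], bool]  # does the function have a value here?
    rule: Callable[[object], bool]         # the value, when there is one

    def value(self, x: object) -> bool:
        # Outside the range of significance there is no value at all,
        # neither true nor false.
        if not self.significant(x):
            raise ValueError(f"function is not significant for {x!r}")
        return self.rule(x)

    def always(self, candidates: Iterable[object]) -> bool:
        # "All values of the function are true": arguments for which the
        # function has no value are passed over, not counted as false.
        return all(self.rule(x) for x in candidates if self.significant(x))


# Toy universe: integers stand in for one type, strings for another.
is_even_or_odd = PropositionalFunction(
    significant=lambda x: isinstance(x, int),
    rule=lambda x: x % 2 == 0 or x % 2 == 1,
)

universe = [0, 1, 2, "Socrates", "Plato"]
print(is_even_or_odd.always(universe))  # quantifies over the three integers only
```

The design choice worth noting is that the restriction lives inside the function and is not stated as a hypothesis about x, which is the sense in which the range of significance is said to be implicit.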

The above argumentation may be summed up as follows : 

The difficulty which besets attempts to restrict the variable is, that 
restrictions naturally express themselves as hypotheses that the variable is of 
such or such a kind, and that, when so expressed, the resulting hypothetical is 
free from the intended restriction. For example, let us attempt to restrict the 

*A function is said to be significant for the argument x if it has a value for this argument. Thus we may 
say shortly "φx is significant," meaning "the function φ has a value for the argument x." The range of 
significance of a function consists of all the arguments for which the function is true, together with all the 
arguments for which it is false. 

†A linguistically convenient expression for this idea is: "φx is true for all possible values of x," a possible 
value being understood to be one for which φx is significant. 
variable to men, and assert that, subject to this restriction, "x is mortal" is 
always true. Then what is always true is that if x is a man, x is mortal ; and 
this hypothetical is true even when x is not a man. Thus a variable can never 
be restricted within a certain range if the propositional function in which the 
variable occurs remains significant when the variable is outside that range. But 
if the function ceases to be significant when the variable goes outside a certain 
range, then the variable is ipso facto confined to that range, without the need of 
any explicit statement to that effect. This principle is to be borne in mind in 
the development of logical types, to which we shall shortly proceed. 
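
A small sketch may make the failure of restriction by hypothesis concrete. It assumes the reading of "implies" as "the hypothesis is false or the conclusion is true"; the list `beings` and its entries are hypothetical toy data, not anything given in the text.

```python
def implies(p: bool, q: bool) -> bool:
    # "p implies q" read as "p is false or q is true".
    return (not p) or q


# Hypothetical toy data: (name, is a man, is mortal).
beings = [
    ("Socrates", True, True),
    ("Plato", True, True),
    ("Bucephalus", False, True),   # a horse: not a man, but mortal
    ("Olympus", False, False),     # a mountain: neither a man nor mortal
]

# The hypothetical "if x is a man, x is mortal" receives a value for every
# being in the list, whether or not that being is a man.
values = {name: implies(is_man, is_mortal) for name, is_man, is_mortal in beings}
print(values)
```

The hypothesis "x is a man" does not shrink the range of x: the hypothetical is significant, and here true, for the horse and the mountain as well.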

We can now begin to see how it comes that " all so-and-so's " is sometimes 
a legitimate phrase and sometimes not. Suppose we say "all terms which have 
the property φ have the property ψ." That means, according to the above 
interpretation, "φx always implies ψx." Provided the range of significance of 
φx is the same as that of ψx, this statement is significant; thus, given any 
definite function φx, there are propositions about "all the terms satisfying φx." 
But it sometimes happens (as we shall see more fully later on) that what appears 
verbally as one function is really many analogous functions with different ranges 
of significance. This applies, for example, to "p is true," which, we shall find, 
is not really one function of p, but is different functions according to the kind 
of proposition that p is. In such a case, the phrase expressing the ambiguous 
function may, owing to the ambiguity, be significant throughout a set of values 
of the argument exceeding the range of significance of any one function. In 
such a case, all is not legitimate. Thus if we try to say "all true propositions 
have the property φ," i. e., "'p is true' always implies φp," the possible arguments 
to 'p is true' necessarily exceed the possible arguments to φ, and therefore 
the attempted general statement is impossible. For this reason, genuine 
general statements about all true propositions can not be made. It may happen, 
however, that the supposed function φ is really ambiguous like 'p is true;' and 
if it happens to have an ambiguity precisely of the same kind as that of 'p is 
true,' we may be able always to give an interpretation to the proposition "'p is 
true' implies φp." This will occur, e.g., if φp is "not-p is false." Thus we 
get an appearance, in such cases, of a general proposition concerning all propo- 
sitions ; but this appearance is due to a systematic ambiguity about such words 
as true and false. (This systematic ambiguity results from the hierarchy of 
propositions which will be explained later on). We may, in all such cases, make 
our statement about any proposition, since the meaning of the ambiguous words 
will adapt itself to any proposition. But if we turn our proposition into an 
apparent variable, and say something about all, we must suppose the ambiguous 
words fixed to this or that possible meaning, though it may be quite irrelevant 
which of their possible meanings they are to have. This is how it happens both 
that all has limitations which exclude "all propositions," and that there never- 
theless seem to be true statements about " all propositions." Both these points 
will become plainer when the theory of types has been explained. 

It has often been suggested* that what is required in order that it may be 
legitimate to speak of all of a collection is that the collection should be finite. 
Thus " all men are mortal " will be legitimate because men form a finite class. 
But that is not really the reason why we can speak of "all men." What is 
essential, as appears from the above discussion, is not finitude, but what may be 
called logical homogeneity. This property is to belong to any collection whose 
terms are all contained within the range of significance of some one function. 
It would always be obvious at a glance whether a collection possessed this 
property or not, if it were not for the concealed ambiguity in common logical 
terms such as true and false, which gives an appearance of being a single function 
to what is really a conglomeration of many functions with different ranges of 
significance. 

The conclusions of this section are as follows : Every proposition containing 
all asserts that some propositional function is always true ; and this means that 
all values of the said function are true, not that the function is true for all argu- 
ments, since there are arguments for which any given function is meaningless, 
i. e., has no value. Hence we can speak of all of a collection when and only 
when the collection forms part or the whole of the range of significance of 
some propositional function, the range of significance being defined as the 
collection of those arguments for which the function in question is significant, 
i. e., has a value. 

IV. 

The Hierarchy of Types. 

A type is defined as the range of significance of a propositional function, 
i. e., as the collection of arguments for which the said function has values. 
Whenever an apparent variable occurs in a proposition, the range of values of the 
apparent variable is a type, the type being fixed by the function of which " all 

*E. g., by M. Poincaré, Revue de Métaphysique et de Morale, Mai, 1906. 
values" are concerned. The division of objects into types is necessitated by the 
reflexive fallacies which otherwise arise. These fallacies, as we saw, are to 
be avoided by what may be called the "vicious-circle principle;" i.e., "no 
totality can contain members defined in terms of itself." This principle, in our 
technical language, becomes : " Whatever contains an apparent variable must 
not be a possible value of that variable." Thus whatever contains an apparent 
variable must be of a different type from the possible values of that variable ; 
we will say that it is of a higher type. Thus the apparent variables contained 
in an expression are what determines its type. This is the guiding principle in 
what follows. 

Propositions which contain apparent variables are generated from such as 
do not contain these apparent variables by processes of which one is always the 
process of generalization, i. e., the substitution of a variable for one of the terms 
of a proposition, and the assertion of the resulting function for all possible 
values of the variable. Hence a proposition is called a generalized proposition 
when it contains an apparent variable. A proposition containing no apparent 
variable we will call an elementary proposition. It is plain that a proposition 
containing an apparent variable presupposes others from which it can be 
obtained by generalization; hence all generalized propositions presuppose 
elementary propositions. In an elementary proposition we can distinguish one 
or more terms from one or more concepts ; the terms are whatever can be regarded 
as the subject of the proposition, while the concepts are the predicates or relations 
asserted of these terms.* The terms of elementary propositions we will call 
individuals; these form the first or lowest type. 

It is unnecessary, in practice, to know what objects belong to the lowest 
type, or even whether the lowest type of variable occurring in a given context 
is that of individuals or some other. For in practice only the relative types of 
variables are relevant ; thus the lowest type occurring in a given context may 
be called that of individual so far as that context is concerned. It follows that 
the above account of individuals is not essential to the truth of what follows ; 
all that is essential is the way in which other types are generated from indi- 
viduals, however the type of individuals may be constituted. 

By applying the process of generalization to individuals occurring in 
elementary propositions, we obtain new propositions. The legitimacy of this 

*See Principles of Mathematics, §48. 
process requires only that no individuals should be propositions. That this is 
so, is to be secured by the meaning we give to the word individual. We may 
define an individual as something destitute of complexity ; it is then obviously 
not a proposition, since propositions are essentially complex. Hence in applying 
the process of generalization to individuals we run no risk of incurring reflexive 
fallacies. 

Elementary propositions together with such as contain only individuals as 
apparent variables we will call first-order propositions. These form the second 
logical type. 

We have thus a new totality, that of first-order propositions. We can thus 
form new propositions in which first-order propositions occur as apparent 
variables. These we will call second-order propositions ; these form the third 
logical type. Thus, e.g., if Epimenides asserts "all first-order propositions 
affirmed by me are false," he asserts a second-order proposition ; he may assert 
this truly, without asserting truly any first-order proposition, and thus no con- 
tradiction arises. 

The above process can be continued indefinitely. The (n + 1)th logical type 
will consist of propositions of order n, which will be such as contain propositions 
of order n − 1, but of no higher order, as apparent variables. The types so 
obtained are mutually exclusive, and thus no reflexive fallacies are possible so 
long as we remember that an apparent variable must always be confined within 
some one type. 
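
The bookkeeping of orders and types just described can be sketched as follows. This is a minimal illustration, not a formal system: the class `Proposition`, the field `apparent_variable_orders`, and the convention of giving individuals the order 0 are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple

INDIVIDUAL = 0  # convention of this sketch: individuals are given order 0


@dataclass(frozen=True)
class Proposition:
    label: str
    # Orders of the apparent variables; empty for an elementary proposition.
    apparent_variable_orders: Tuple[int, ...] = ()

    def order(self) -> int:
        # Elementary propositions, and those whose apparent variables are all
        # individuals, are first-order; otherwise the order is one above the
        # highest order of proposition occurring as apparent variable.
        highest = max(self.apparent_variable_orders, default=INDIVIDUAL)
        return highest + 1

    def logical_type(self) -> int:
        # Propositions of order n form the (n + 1)th logical type.
        return self.order() + 1


socrates_is_mortal = Proposition("Socrates is mortal")
all_men_are_mortal = Proposition("if x is a man, x is mortal, always", (INDIVIDUAL,))
epimenides = Proposition(
    "all first-order propositions affirmed by me are false",
    (socrates_is_mortal.order(),),  # the apparent variable is a first-order proposition
)

for prop in (socrates_is_mortal, all_men_are_mortal, epimenides):
    print(prop.label, "| order", prop.order(), "| type", prop.logical_type())
```

On this reckoning the statement of Epimenides comes out of order 2, and so falls outside the totality of first-order propositions of which it speaks.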

In practice, a hierarchy of functions is more convenient than one of propo- 
sitions. Functions of various orders may be obtained from propositions of 
various orders by the method of substitution. If p is a proposition, and a a constituent 
of p, let "p/a;x" denote the proposition which results from substituting 
x for a wherever a occurs in p. Then p/a, which we will call a matrix, 
may take the place of a function; its value for the argument x is p/a;x, and its 
value for the argument a is p. Similarly, if "p/(a, b);(x, y)" denotes the result 
of first substituting x for a and then substituting y for b, we may use the double 
matrix p/(a, b) to represent a double function. In this way we can avoid 
apparent variables other than individuals and propositions of various orders. 
The order of a matrix will be defined as being the order of the proposition in 
which the substitution is effected, which proposition we will call the prototype. 
The order of a matrix does not determine its type : in the first place because it 
does not determine the number of arguments for which others are to be substituted 
(i.e., whether the matrix is of the form p/a or p/(a, b) or p/(a, b, c) 
etc.); in the second place because, if the prototype is of more than the first 
order, the arguments may be either propositions or individuals. But it is plain 
that the type of a matrix is definable always by means of the hierarchy of 
propositions. 
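
The method of substitution lends itself to a short sketch. The representation of propositions as nested tuples of strings, and the names `substitute`, `matrix` and `double_matrix`, are assumptions made for illustration only.

```python
from typing import Callable, Tuple, Union

# Hypothetical representation: a constituent is a string, a complex is a tuple.
Term = Union[str, Tuple["Term", ...]]


def substitute(p: Term, a: Term, x: Term) -> Term:
    """Return p/a;x: the result of putting x for a wherever a occurs in p."""
    if p == a:
        return x
    if isinstance(p, tuple):
        return tuple(substitute(part, a, x) for part in p)
    return p


def matrix(p: Term, a: Term) -> Callable[[Term], Term]:
    """The matrix p/a, standing in for a function of one argument."""
    return lambda x: substitute(p, a, x)


def double_matrix(p: Term, a: Term, b: Term) -> Callable[[Term, Term], Term]:
    """The double matrix p/(a, b): substitute x for a first, then y for b."""
    return lambda x, y: substitute(substitute(p, a, x), b, y)


prototype = ("implies", ("man", "Socrates"), ("mortal", "Socrates"))
m = matrix(prototype, "Socrates")
print(m("Plato"))                  # the value of p/a for the argument Plato
print(m("Socrates") == prototype)  # the value for the argument a is p itself

loves = ("loves", "Romeo", "Juliet")
print(double_matrix(loves, "Romeo", "Juliet")("Antony", "Cleopatra"))
```

Because the two substitutions are performed in succession, as in the text, an x that itself contained b would be altered by the second step; the sketch does not guard against this.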

Although it is possible to replace functions by matrices, and although this 
procedure introduces a certain simplicity into the explanation of types, it is 
technically inconvenient. Technically, it is convenient to replace the prototype 
p by φa, and to replace p/a;x by φx; thus where, if matrices were being employed, 
p and a would appear as apparent variables, we now have φ as our 
apparent variable. In order that φ may be legitimate as an apparent variable, 
it is necessary that its values should be confined to propositions of some one type. 
Hence we proceed as follows. 

A function whose argument is an individual and whose value is always a 
first-order proposition will be called a first-order function. A function involving 
a first-order function or proposition as apparent variable will be called a second- 
order function, and so on. A function of one variable which is of the order 
next above that of its argument will be called a predicative function ; the same 
name will be given to a function of several variables if there is one among these 
variables in respect of which the function becomes predicative when values are 
assigned to all the other variables. Then the type of a function is determined 
by the type of its values and the number and type of its arguments. 
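
The definition of a predicative function of one variable amounts to a simple comparison of orders, which may be sketched thus. The class `Function`, its fields, the convention that an individual argument has order 0, and the three examples are hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    argument_order: int  # 0 for an individual argument (convention of this sketch)
    order: int           # order of the function itself

    def is_predicative(self) -> bool:
        # Predicative: of the order next above that of its argument.
        return self.order == self.argument_order + 1


examples = [
    # A first-order function of an individual.
    Function("x is a man", argument_order=0, order=1),
    # A function of an individual that involves a first-order function as
    # apparent variable is second-order, hence not predicative in x.
    Function("x has every first-order property of Socrates", argument_order=0, order=2),
    # A second-order function whose argument is a first-order function.
    Function("phi holds of Socrates", argument_order=1, order=2),
]

for f in examples:
    print(f.name, "->", "predicative" if f.is_predicative() else "not predicative")
```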

The hierarchy of functions may be further explained as follows. A first-order 
function of an individual x will be denoted by φ!x (the letters ψ, χ, θ, 
f, g, F, G will also be used for functions). No first-order function contains a 
function as apparent variable; hence such functions form a well-defined totality, 
and the φ in φ!x can be turned into an apparent variable. Any proposition in 
which φ appears as apparent variable, and there is no apparent variable of higher 
type than φ, is a second-order proposition. If such a proposition contains an 
individual x, it is not a predicative function of x; but if it contains a first-order 
function φ, it is a predicative function of φ, and will be written f!(ψ!ẑ). Then 
f is a second-order predicative function; the possible values of f again form a 
well-defined totality, and we can turn f into an apparent variable. We can 
thus define third-order predicative functions, which will be such as have third- 
order propositions for their values and second-order predicative functions for 
their arguments. And in this way we can proceed indefinitely. A precisely 
similar development applies to functions of several variables. 

We will adopt the following conventions. Variables of the lowest type 
occurring in any context will be denoted by small Latin letters (excluding f 
and g, which are reserved for functions); a predicative function of an argument 
x (where x may be of any type) will be denoted by φ!x (where ψ, χ, θ, f, g, F 
or G may replace φ); similarly a predicative function of two arguments x and y 
will be denoted by φ!(x, y); a general function of x will be denoted by φx, and 
a general function of x and y by φ(x, y). In φx, φ can not be made into an 
apparent variable, since its type is indeterminate; but in φ!x, where φ is a 
predicative function whose argument is of some given type, φ can be made into 
an apparent variable. 

It is important to observe that since there are various types of propositions 
and functions, and since generalization can only be applied within some one type, 
all phrases containing the words "all propositions" or "all functions" are 
prima facie meaningless, though in certain cases they are capable of an unob- 
jectionable interpretation. The contradictions arise from the use of such phrases 
in cases where no innocent meaning can be found. 

If we now revert to the contradictions, we see at once that some of them 
are solved by the theory of types. Wherever "all propositions" are mentioned, 
we must substitute "all propositions of order n," where it is indifferent what 
value we give to n, but it is essential that n should have some value. Thus when 
a man says "I am lying," we must interpret him as meaning: "There is a 
proposition of order n, which I affirm, and which is false." This is a proposition 
of order n + 1 ; hence the man is not affirming any proposition of order n ; 
hence his statement is false, and yet its falsehood does not imply, as that of 
" I am lying " appeared to do, that he is making a true statement. This solves 
the liar. 
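
The resolution of the liar can be traced step by step in a small sketch. It rests on two assumptions that are stated here and again in the code: the value n = 1 is an arbitrary choice, and the speaker is supposed to affirm nothing of order n. The names `affirmed` and `i_am_lying` are hypothetical.

```python
from typing import Dict, List

n = 1  # hypothetical choice; any fixed order serves equally well

# Truth values of the propositions the speaker affirms, keyed by their order.
# Assumption of this sketch: the speaker affirms nothing of order n.
affirmed: Dict[int, List[bool]] = {n: []}


def i_am_lying(order: int) -> bool:
    """'There is a proposition of this order, which I affirm, and which is false.'"""
    return any(value is False for value in affirmed.get(order, []))


statement = i_am_lying(n)        # evaluated against order-n affirmations only
affirmed[n + 1] = [statement]    # the statement itself is of order n + 1

print(statement)                 # False: no false order-n proposition is affirmed
print(i_am_lying(n))             # still False after recording the statement
print(i_am_lying(n + 1))         # True, but this is a new statement of order n + 2
```

The falsehood of the order-(n + 1) statement never feeds back into the totality over which it quantifies, so no contradiction is generated.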

Consider next " the least integer not nameable in fewer than nineteen 
syllables." It is to be observed, in the first place, that nameable must mean 
"nameable by means of such-and-such assigned names," and that the number of 
assigned names must be finite. For if it is not finite, there is no reason why 
there should be any integer not nameable in fewer than nineteen syllables, and 
the paradox collapses. We may next suppose that " nameable in terms of names 
of the class N" means "being the only term satisfying some function composed 
wholly of names of the class N." The solution of this paradox lies, I think, in 
the simple observation that " nameable in terms of names of the class N" is 
never itself nameable in terms of names of that class. If we enlarge N by 
adding the name "nameable in terms of names of the class N," our fundamental 
apparatus of names is enlarged; calling the new apparatus N', "nameable in 
terms of names of the class N'" remains not nameable in terms of names of the 
class N'. If we try to enlarge N till it embraces all names, "nameable" becomes 
(by what was said above) "being the only term satisfying some function 
composed wholly of names." But here there is a function as apparent variable ; 
hence we are confined to predicative functions of some one type (for non-predi- 
cative functions can not be apparent variables). Hence we have only to observe 
that nameability in terms of such functions is non-predicative in order to escape 
the paradox. 

The case of " the least indefinable ordinal " is closely analogous to the case 
we have just discussed. Here, as before, "definable" must be relative to some 
given apparatus of fundamental ideas ; and there is reason to suppose that 
" definable in terms of ideas of the class N" is not definable in terms of ideas 
of the class N. It will be true that there is some definite segment of the series 
of ordinals consisting wholly of definable ordinals, and having the least inde- 
finable ordinal as its limit. This least indefinable ordinal will be definable by a 
slight enlargement of our fundamental apparatus ; but there will then be a new 
ordinal which will be the least that is indefinable with the new apparatus. If 
we enlarge our apparatus so as to include all possible ideas, there is no longer 
any reason to believe that there is any indefinable ordinal. The apparent force 
of the paradox lies largely, I think, in the supposition that if all the ordinals 
of a certain class are definable, the class must be definable, in which case its 
successor is of course also definable ; but there is no reason for accepting this 
supposition. 

The other contradictions, that of Burali-Forti in particular, require some 
further developments for their solution. 

V. 

The Axiom of Reducibility. 

A propositional function of x may, as we have seen, be of any order ; hence 
any statement about "all properties of x" is meaningless. (A "property of x " 
is the same thing as a "propositional function which holds of x.") But it is 
absolutely necessary, if mathematics is to be possible, that we should have some 
method of making statements which will usually be equivalent to what we have 
in mind when we (inaccurately) speak of " all properties of x." This necessity 
appears in many cases, but especially in connection with mathematical induction. 
We can say, by the use of any instead of all, "Any property possessed by 0, 
and by the successors of all numbers possessing it, is possessed by all finite 
numbers." But we can not go on to: "A finite number is one which possesses 
all properties possessed by 0 and by the successors of all numbers possessing 
them." If we confine this statement to all first-order properties of numbers, we 
can not infer that it holds of second-order properties. For example, we shall 
be unable to prove that if m, n are finite numbers, then m + n is a finite number. 
For, with the above definition, "m is a finite number" is a second-order property 
of m; hence the fact that m + 0 is a finite number, and that, if m + n is a 
finite number, so is m + n + 1, does not allow us to conclude by induction that 
m + n is a finite number. It is obvious that such a state of things renders much 
of elementary mathematics impossible. 
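
The obstacle may be displayed by writing out the attempted definition with the variable property confined to one order. This is a sketch only: the abbreviation Fin_1 and the explicit restriction of φ to first-order properties are assumptions introduced here for illustration, and the symbols are those explained in Section VI.

```latex
\mathrm{Fin}_1(m) \;.=:.\; (\phi) :. \; \phi\,!\,0 \;:\; (n) \,.\, \phi\,!\,n \supset \phi\,!\,(n+1) \;:\supset.\; \phi\,!\,m \qquad \text{Df.}
```

Since φ occurs here as an apparent variable ranging over first-order properties, Fin_1 is a second-order property of m and is not among the values of φ. The property of n expressed by "m + n is finite" is likewise second-order, so the inductive clause of the definition can not be applied to it.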

The other definition of finitude, by the non-similarity of whole and part, 
fares no better. For this definition is : "A class is said to be finite when every 
one-one relation whose domain is the class and whose converse domain is 
contained in the class has the whole class for its converse domain." Here a 
variable relation appears, i. e., a variable function of two variables ; we 
have to take all values of this function, which requires that it should be of some 
assigned order ; but any assigned order will not enable us to deduce many of 
the propositions of elementary mathematics. 

Hence we must find, if possible, some method of reducing the order of a 
propositional function without affecting the truth or falsehood of its values. 
This seems to be what common-sense effects by the admission of classes. Given 
any propositional function φx, of whatever order, this is assumed to be equivalent, 
for all values of x, to a statement of the form "x belongs to the class α." Now 
this statement is of the first order, since it makes no allusion to "all functions 
of such-and-such a type." Indeed its only practical advantage over the original 
statement φx is that it is of the first order. There is no advantage in assuming 
that there really are such things as classes, and the contradiction about the 
classes which are not members of themselves shows that, if there are classes, 
they must be something radically different from individuals. I believe the chief 
purpose which classes serve, and the chief reason which makes them linguistically 
convenient, is that they provide a method of reducing the order of a propositional 
function. I shall, therefore, not assume anything of what may seem to be 
involved in the common-sense admission of classes, except this: that every 
propositional function is equivalent, for all its values, to some predicative 
function. 

This assumption with regard to functions is to be made whatever may be 
the type of their arguments. Let φx be a function, of any order, of an argument 
x, which may itself be either an individual or a function of any order. 
If φ is of the order next above x, we write the function in the form φ!x; in 
such a case we will call φ a predicative function. Thus a predicative function of 
an individual is a first-order function ; and for higher types of arguments, 
predicative functions take the place that first-order functions take in respect of 
individuals. We assume, then, that every function is equivalent, for all its 
values, to some predicative function of the same argument. This assumption 
seems to be the essence of the usual assumption of classes ; at any rate, it 
retains as much of classes as we have any use for, and little enough to avoid the 
contradictions which a less grudging admission of classes is apt to entail. We 
will call this assumption the axiom of classes, or the axiom of reducibility. 

We shall assume similarly that every function of two variables is equivalent, 
for all its values, to a predicative function of those variables, where a predicative 
function of two variables is one such that there is one of the variables in respect 
of which the function becomes predicative (in our previous sense) when a value 
is assigned to the other variable. This assumption is what seems to be meant 
by saying that any statement about two variables defines a relation between 
them. We will call this assumption the axiom of relations or the axiom of 
reducibility. 
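
In the symbolism explained in Section VI, the two assumptions may be set down as follows. This is offered as a sketch of their symbolic form; the use of the letter f for the predicative function whose existence is assumed is a choice made here for illustration.

```latex
% Axiom of classes: every function of one variable is equivalent,
% for all its values, to some predicative function of the same argument.
\vdash \,:.\; (\exists f) \,:.\; (x) \,:\, \phi x \;.\equiv.\; f\,!\,x

% Axiom of relations: the same for functions of two variables.
\vdash \,:.\; (\exists f) \,:.\; (x, y) \,:\, \phi(x, y) \;.\equiv.\; f\,!\,(x, y)
```

In each case φ may be of any order, while f is predicative; only the truth or falsehood of the values is asserted to agree.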

In dealing with relations between more than two terms, similar assumptions 
would be needed for three, four, . . . variables. But these assumptions are not 
indispensable for our purpose, and are therefore not made in this paper. 

By the help of the axiom of reducibility, statements about " all first-order 
functions of x" or "all predicative functions of α" yield most of the results 
which otherwise would require " all functions." The essential point is that such 
results are obtained in all cases where only the truth or falsehood of values of 
the functions concerned are relevant, as is invariably the case in mathematics. 
Thus mathematical induction, for example, need now only be stated for all 
predicative functions of numbers; it then follows from the axiom of classes 
that it holds of any function of whatever order. It might be thought that the 
paradoxes for the sake of which we invented the hierarchy of types would now 
reappear. But this is not the case, because, in such paradoxes, either something 
beyond the truth or falsehood of values of functions is relevant, or expressions 
occur which are unmeaning even after the introduction of the axiom of reducibility. 
For example, such a statement as "Epimenides asserts ψx" is not 
equivalent to "Epimenides asserts φ!x," even though ψx and φ!x are equivalent. 
Thus "I am lying" remains unmeaning if we attempt to include all 
propositions among those which I may be falsely affirming, and is unaffected by 
the axiom of classes if we confine it to propositions of order n. The hierarchy 
of propositions and functions, therefore, remains relevant in just those cases in 
which there is a paradox to be avoided. 

VI. 

Primitive Ideas and Propositions of Symbolic Logic. 

The primitive ideas required in symbolic logic appear to be the following 
seven : 

(1) Any propositional function of a variable x or of several variables 
x, y, z, . . . This will be denoted by φx or φ(x, y, z, . . .). 

(2) The negation of a proposition. If p is the proposition, its negation 
will be denoted by ~ p. 

(3) The disjunction or logical sum of two propositions; i. e., "this or that." 
If p, q are the two propositions, their disjunction will be denoted by p ∨ q.* 

(4) The truth of any value of a propositional function; i. e., of φx where x 
is not specified. 

(5) The truth of all values of a propositional function. This is denoted by 
(x) . φx or (x) : φx or whatever larger number of dots may be necessary to 
bracket off the proposition.† In (x) . φx, x is called an apparent variable, 
whereas when φx is asserted, where x is not specified, x is called a real variable. 

(6) Any predicative function of an argument of any type; this will be 
represented by φ!x or φ!α or φ!R, according to circumstances. A predicative 
function of x is one whose values are propositions of the type next above 
that of x, if x is an individual or a proposition, or that of values of x if x is a 

*In a previous article in this journal, I took implication as indefinable, instead of disjunction. The 
choice between the two is a matter of taste; I now choose disjunction, because it enables us to diminish the 
number of primitive propositions. 

† The use of dots follows Peano's usage. It is fully explained by Mr. Whitehead, "On Cardinal Numbers," 
American Journal of Mathematics, Vol. XXIV, and "On Mathematical Concepts of the Material 
World," Phil. Trans. A., Vol. CCV, p. 472. 
function. It may be described as one in which the apparent variables, if any, 
are all of the same type as x or of lower type ; and a variable is of lower type 
than x if it can significantly occur as argument to x, or as argument to an argu- 
ment to x, etc. 

(7) Assertion; i. e., the assertion that some proposition is true, or that any 
value of some propositional function is true. This is required to distinguish a 
proposition actually asserted from one merely considered, or from one adduced 
as hypothesis to some other. It will be indicated by the sign "⊢" prefixed to 
what is asserted, with enough dots to bracket off what is asserted.* 

Before proceeding to the primitive propositions, we need certain definitions. 
In the following definitions, as well as in the primitive propositions, the letters 
p, q, r are used to denote propositions. 

$$p \supset q \;.=.\; {\sim}p \lor q \qquad \text{Df.}$$

This definition states that "p ⊃ q" (which is read "p implies q") is to mean 
"p is false or q is true." I do not mean to affirm that "implies" can not have 
any other meaning, but only that this meaning is the one which it is most con- 
venient to give to " implies " in symbolic logic. In a definition, the sign of 
equality and the letters "Df" are to be regarded as one symbol, meaning jointly 
"is defined to mean." The sign of equality without the letters "Df" has a 
different meaning, to be defined shortly. 

$$p \,.\, q \;.=.\; {\sim}({\sim}p \lor {\sim}q) \qquad \text{Df.}$$

This defines the logical product of two propositions p and q, i. e., "p and q 
are both true." The above definition states that this is to mean : " It is false 
that either p is false or q is false." Here again, the definition does not give the 
only meaning which can be given to "p and q are both true," but gives the 
meaning which is most convenient for our purposes. 

$$p \equiv q . = . p \supset q . q \supset p \quad \text{Df.}$$

That is, "$p \equiv q$," which is read "p is equivalent to q," means "p implies q
and q implies p;" whence, of course, it follows that p and q are both true or
both false.
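
As an informal check of the three definitions just given, the following Python sketch (an editorial illustration, not part of the original text) builds implication, logical product, and equivalence from negation and disjunction alone. The function names `neg`, `disj`, `implies`, `conj`, and `equiv` are assumptions chosen for readability, and reading the connectives as Boolean truth functions is itself an assumption of the sketch.

```python
from itertools import product

def neg(p: bool) -> bool:
    return not p

def disj(p: bool, q: bool) -> bool:
    return p or q

# The three defined connectives, built only from neg and disj.
def implies(p: bool, q: bool) -> bool:
    return disj(neg(p), q)               # "p is false or q is true"

def conj(p: bool, q: bool) -> bool:
    return neg(disj(neg(p), neg(q)))     # "it is false that either p is false or q is false"

def equiv(p: bool, q: bool) -> bool:
    return conj(implies(p, q), implies(q, p))

# One row per assignment: (p, q, p implies q, p and q, p equivalent to q)
TABLE = [(p, q, implies(p, q), conj(p, q), equiv(p, q))
         for p, q in product([True, False], repeat=2)]

for row in TABLE:
    print(row)
```

On this reading, `equiv(p, q)` comes out true exactly on the rows where p and q are both true or both false.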

$$(\exists x) . \phi x . = . {\sim}\{(x) . {\sim}\phi x\} \quad \text{Df.}$$

* This sign, as well as the introduction of the idea which it expresses, are due to Frege. See his
Begriffsschrift (Halle, 1879), p. 1, and Grundgesetze der Arithmetik, Vol. I (Jena, 1893), p. 9.
This defines "there is at least one value of x for which $\phi x$ is true." We
define it as meaning "it is false that $\phi x$ is always false."

$$x = y . = : (\phi) : \phi ! x . \supset . \phi ! y \quad \text{Df.}$$

This is the definition of identity. It states that x and y are to be called 
identical when every predicative function satisfied by x is satisfied by y. It 
follows from the axiom of reducibility that if x satisfies $\psi x$, where $\psi$ is any
function, predicative or non-predicative, then y satisfies $\psi y$.

The following definitions are less important, and are introduced solely for 
the purpose of abbreviation. 

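$$(x, y) . \phi(x, y) . = : (x) : (y) . \phi(x, y) \quad \text{Df.}$$

$$(\exists x, y) . \phi(x, y) . = : (\exists x) : (\exists y) . \phi(x, y) \quad \text{Df.}$$

$$\phi x . \supset_x . \psi x : = : (x) : \phi x \supset \psi x \quad \text{Df.}$$

$$\phi x . \equiv_x . \psi x : = : (x) : \phi x . \equiv . \psi x \quad \text{Df.}$$

$$\phi(x, y) . \supset_{x, y} . \psi(x, y) : = : (x, y) : \phi(x, y) . \supset . \psi(x, y) \quad \text{Df.},$$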

and so on for any number of variables. 

The primitive propositions required are as follows. (In 2, 3, 4, 5, 6, and 10, 
p, q, r stand for propositions.) 

(1) A proposition implied by a true premise is true. 

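(2) $\vdash : p \vee p . \supset . p$.

(3) $\vdash : q . \supset . p \vee q$.

(4) $\vdash : p \vee q . \supset . q \vee p$.

(5) $\vdash : p \vee (q \vee r) . \supset . q \vee (p \vee r)$.

(6) $\vdash :. q \supset r . \supset : p \vee q . \supset . p \vee r$.

(7) $\vdash : (x) . \phi x . \supset . \phi y$;

i. e., "if all values of $\phi\hat{x}$ are true, then $\phi y$ is true, where $\phi y$ is any value." *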

(8) If $\phi y$ is true, where $\phi y$ is any value of $\phi\hat{x}$, then $(x) . \phi x$ is true. This
can not be expressed in our symbols; for if we write "$\phi y . \supset . (x) . \phi x$," that means
"$\phi y$ implies that all values of $\phi\hat{x}$ are true, where y may have any value of the
appropriate type," which is not in general the case. What we mean to assert is:
"If, however y is chosen, $\phi y$ is true, then $(x) . \phi x$ is true;" whereas what is
expressed by "$\phi y . \supset . (x) . \phi x$" is: "However y is chosen, if $\phi y$ is true, then
$(x) . \phi x$ is true," which is quite a different statement, and in general a false one.

* It is convenient to use the notation $\phi\hat{x}$ to denote the function itself, as opposed to this or that value of
the function.
(9) $\vdash : (x) . \phi x . \supset . \phi a$, where a is any definite constant.

This principle is really as many different principles as there are possible 
values of a. I. e., it states that, e. g., whatever holds of all individuals holds 
of Socrates; also that it holds of Plato ; and so on. It is the principle that a 
general rule may be applied to particular cases ; but in order to give it scope, it 
is necessary to mention the particular cases, since otherwise we need the principle 
itself to assure us that the general rule that general rules may be applied to 
particular cases may be applied (say) to the particular case of Socrates. It is 
thus that this principle differs from (7); our present principle makes a statement 
about Socrates, or about Plato, or some other definite constant, whereas (7) 
made a statement about a variable. 

The above principle is never used in symbolic logic or in pure mathematics, 
since all our propositions are general, and even when (as in " one is a number ") 
we seem to have a strictly particular case, this turns out not to be so when 
closely examined. In fact, the use of the above principle is the distinguishing 
mark of applied mathematics. Thus, strictly speaking, we might have omitted 
it from our list. 

(10) $\vdash :. (x) . p \vee \phi x . \supset : p . \vee . (x) . \phi x$;

i. e., "if 'p or $\phi x$' is always true, then either p is true, or $\phi x$ is always true."

(11) When $f(\phi x)$ is true whatever argument x may be, and $F(\phi y)$ is true
whatever possible argument y may be, then $\{f(\phi x) . F(\phi x)\}$ is true whatever
possible argument x may be.

This is the axiom of the " identification of variables." It is needed when 
two separate propositional functions are each known to be always true, and we 
wish to infer that their logical product is always true. This inference is only 
legitimate if the two functions take arguments of the same type, for otherwise 
their logical product is meaningless. In the above axiom, x and y must be of 
the same type, because both occur as arguments to $\phi$.

(12) If $\phi x . \phi x \supset \psi x$ is true for any possible x, then $\psi x$ is true for any
possible x.

This axiom is required in order to assure us that the range of significance
of $\psi x$, in the case supposed, is the same as that of $\phi x . \phi x \supset \psi x . \supset . \psi x$; both
are in fact the same as that of $\phi x$. We know, in the case supposed, that $\psi x$ is true
whenever $\phi x . \phi x \supset \psi x$ and $\phi x . \phi x \supset \psi x . \supset . \psi x$ are both significant, but we do
not know, without an axiom, that $\psi x$ is true whenever $\psi x$ is significant. Hence
the need of the axiom.
Axioms (11) and (12) are required, e.g., in proving

$$(x) . \phi x : (x) . \phi x \supset \psi x : \supset . (x) . \psi x.$$

By (7) and (11),

$$\vdash :. (x) . \phi x : (x) . \phi x \supset \psi x : \supset : \phi y . \phi y \supset \psi y,$$

whence by (12),

$$\vdash :. (x) . \phi x : (x) . \phi x \supset \psi x : \supset : \psi y,$$

whence the result follows by (8) and (10).

(13) $\vdash :. (\exists f) :. (x) : \phi x . \equiv . f ! x$.

This is the axiom of reducibility. It states that, given any function $\phi\hat{x}$,
there is a predicative function $f ! \hat{x}$ such that $f ! x$ is always equivalent to $\phi x$.
Note that, since a proposition beginning with "$(\exists f)$" is, by definition, the
negation of one beginning with "$(f)$," the above axiom involves the possibility of
considering "all predicative functions of x." If $\phi x$ is any function of x, we can
not make propositions beginning with "$(\phi)$" or "$(\exists \phi)$," since we can not
consider "all functions," but only "any function" or "all predicative functions."

(14) $\vdash :. (\exists f) :. (x, y) : \phi(x, y) . \equiv . f ! (x, y)$.

This is the axiom of reducibility for double functions. 
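
The propositional primitives (2) to (6) listed above can be checked mechanically if "or" and "implies" are read as Boolean truth functions. That reading is an assumption of this editorial sketch; the paper itself lays the propositions down as primitive and does not argue from truth tables. The names `AXIOMS` and `is_tautology` are hypothetical helpers, not notation from the text.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    return (not p) or q

# Primitive propositions (2)-(6), each as a truth function of p, q, r.
AXIOMS = {
    2: lambda p, q, r: implies(p or p, p),
    3: lambda p, q, r: implies(q, p or q),
    4: lambda p, q, r: implies(p or q, q or p),
    5: lambda p, q, r: implies(p or (q or r), q or (p or r)),
    6: lambda p, q, r: implies(implies(q, r), implies(p or q, p or r)),
}

def is_tautology(f) -> bool:
    """True when f holds under every assignment of truth-values to p, q, r."""
    return all(f(p, q, r) for p, q, r in product([True, False], repeat=3))

RESULTS = {number: is_tautology(f) for number, f in AXIOMS.items()}
print(RESULTS)
```

The same helper rejects a formula that is not always true, such as "p or q implies p", which gives a quick sanity check on the checker itself.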

In the above propositions, our x and y may be of any type whatever. The
only way in which the theory of types is relevant is that (11) only allows us to
identify real variables occurring in different contexts when they are shown to be
of the same type by both occurring as arguments to the same function, and that,
in (7) and (9), y and a must respectively be of the appropriate type for
arguments to $\phi\hat{z}$. Thus, for example, suppose we have a proposition of the form
$(\phi) . f ! (\phi ! \hat{z}, x)$, which is a second-order function of x. Then by (7),

$$\vdash : (\phi) . f ! (\phi ! \hat{z}, x) . \supset . f ! (\psi ! \hat{z}, x),$$

where $\psi ! \hat{z}$ is any first-order function. But it will not do to treat $(\phi) . f ! (\phi ! \hat{z}, x)$
as if it were a first-order function of x, and take this function as a possible
value of $\psi ! \hat{z}$ in the above. It is such confusions of types that give rise to the
paradox of the liar.

Again, consider the classes which are not members of themselves. It is 
plain that, since we have identified classes with functions,* no class can be 
significantly said to be or not to be a member of itself; for the members of a 
class are arguments to it, and arguments to a function are always of lower type
than the function. And if we ask: "But how about the class of all classes?
Is not that a class, and so a member of itself?", the answer is twofold. First,
if "the class of all classes" means "the class of all classes of whatever type,"
then there is no such notion. Secondly, if "the class of all classes" means
"the class of all classes of type t," then this is a class of the next type above t,
and is therefore again not a member of itself.

* This identification is subject to a modification to be explained shortly.

Thus although the above primitive propositions apply equally to all types, 
they do not enable us to elicit contradictions. Hence in the course of any 
deduction it is never necessary to consider the absolute type of a variable ; it is 
only necessary to see that the different variables occurring in one proposition are 
of the proper relative types. This excludes such functions as that from which 
our third contradiction was obtained, namely: "The relation R holds between
R and S." For a relation between R and S is necessarily of higher type than
either of them, so that the proposed function is meaningless.
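
The restriction described in this section can be pictured with a toy model in which every object carries a numeric type level and membership is only significant one level down. This is an editorial sketch under strong simplifying assumptions: the names `Individual`, `TypedClass`, `make_class`, and `is_member` are hypothetical, a single integer stands in for the full hierarchy of types and orders, and the empty class is arbitrarily given level 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Individual:
    name: str
    type_level: int = 0            # individuals form the lowest type

@dataclass(frozen=True)
class TypedClass:
    members: frozenset
    type_level: int                # one above the type of its members

def make_class(members) -> TypedClass:
    members = frozenset(members)
    levels = {m.type_level for m in members}
    if len(levels) > 1:
        raise TypeError("members of a class must all be of one type")
    # Assumption of the sketch: an empty class is treated as a class of individuals.
    level = levels.pop() + 1 if levels else 1
    return TypedClass(members, level)

def is_member(x, cls: TypedClass) -> bool:
    # "x is a member of cls" is significant only when x is of the next lower type.
    if x.type_level != cls.type_level - 1:
        raise TypeError("meaningless: argument is not of the next lower type")
    return x in cls.members

socrates = Individual("Socrates")
men = make_class([socrates])       # a class of individuals
classes = make_class([men])        # a class of classes of individuals
```

In this model `is_member(socrates, men)` is an ordinary true statement, while `is_member(men, men)` is rejected before any truth-value is computed, which mirrors the claim that "a class is a member of itself" is neither true nor false but meaningless.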

VII. 

Elementary Theory of Classes and Relations. 

Propositions in which a function $\phi$ occurs may depend, for their truth-value,
upon the particular function $\phi$, or they may depend only upon the extension of
$\phi$, i. e., upon the arguments which satisfy $\phi$. A function of the latter sort we
will call extensional. Thus, e.g., "I believe that all men are mortal" may not
be equivalent to "I believe that all featherless bipeds are mortal," even if men
are coextensive with featherless bipeds; for I may not know that they are
coextensive. But "all men are mortal" must be equivalent to "all featherless
bipeds are mortal" if men are coextensive with featherless bipeds. Thus "all
men are mortal" is an extensional function of the function "x is a man," while
"I believe all men are mortal" is a function which is not extensional; we will
call functions intensional when they are not extensional. The functions of
functions with which mathematics is specially concerned are all extensional.
The mark of an extensional function f of a function $\phi ! \hat{z}$ is

$$\phi ! x . \equiv_x . \psi ! x : \supset_{\phi, \psi} : f(\phi ! \hat{z}) . \equiv . f(\psi ! \hat{z}).$$
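
The contrast between extensional and intensional functions of functions can be made concrete over a small finite domain. The sketch below is editorial: the domain, the predicates, and the helper names (`extension`, `all_are_mortal`, `described_as`, `is_extensional_on`) are all hypothetical, and `described_as` merely stands in for a context, such as belief, that is sensitive to how a function is given.

```python
DOMAIN = ["socrates", "plato", "fido"]
MORTALS = {"socrates", "plato", "fido"}

def is_man(x) -> bool:
    return x in ("socrates", "plato")

def is_featherless_biped(x) -> bool:
    return x in ("plato", "socrates")

def extension(phi) -> frozenset:
    """The arguments in DOMAIN that satisfy phi."""
    return frozenset(x for x in DOMAIN if phi(x))

def all_are_mortal(phi) -> bool:
    # Depends only on which arguments satisfy phi: extensional.
    return all(x in MORTALS for x in extension(phi))

def described_as(phi) -> str:
    # Depends on the particular function, not merely on its extension: intensional.
    return phi.__name__

def is_extensional_on(f, functions) -> bool:
    """Check the mark of extensionality on a finite list of functions:
    coextensive arguments must always give f the same value."""
    return all(f(phi) == f(psi)
               for phi in functions for psi in functions
               if extension(phi) == extension(psi))

SAMPLE = [is_man, is_featherless_biped]
```

Here `is_man` and `is_featherless_biped` are coextensive, so `all_are_mortal` cannot tell them apart, whereas `described_as` can.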

From any function f of a function $\phi ! \hat{z}$ we can derive an associated
extensional function as follows. Put

$$f\{\hat{z}(\psi z)\} . = : (\exists \phi) : \phi ! x . \equiv_x . \psi x : f\{\phi ! \hat{z}\} \quad \text{Df.}$$
The function $f\{\hat{z}(\psi z)\}$ is in reality a function of $\psi\hat{z}$, though not the same
function as $f(\psi\hat{z})$, supposing this latter to be significant. But it is convenient to
treat $f\{\hat{z}(\psi z)\}$ technically as though it had an argument $\hat{z}(\psi z)$, which we call
"the class defined by $\psi$." We have

$$\vdash :. \phi x . \equiv_x . \psi x : \supset : f\{\hat{z}(\phi z)\} . \equiv . f\{\hat{z}(\psi z)\},$$

whence, applying to the fictitious objects $\hat{z}(\phi z)$ and $\hat{z}(\psi z)$ the definition of identity
given above, we find

$$\vdash :. \phi x . \equiv_x . \psi x : \supset . \hat{z}(\phi z) = \hat{z}(\psi z).$$

This, with its converse (which can also be proved), is the distinctive
property of classes. Hence we are justified in treating $\hat{z}(\phi z)$ as the class defined
by $\phi$. In the same way we put

$$f\{\hat{x}\hat{y}\psi(x, y)\} . = : (\exists \phi) : \phi ! (x, y) . \equiv_{x, y} . \psi(x, y) : f\{\phi ! (\hat{x}, \hat{y})\} \quad \text{Df.}$$

A few words are necessary here as to the distinction between $\phi ! (\hat{x}, \hat{y})$ and
$\phi ! (\hat{y}, \hat{x})$. We will adopt the following convention: When a function (as opposed
to its values) is represented in a form involving $\hat{x}$ and $\hat{y}$, or any other two letters
of the alphabet, the value of this function for the arguments a and b is to be
found by substituting a for $\hat{x}$ and b for $\hat{y}$; i. e., the argument mentioned first is
to be substituted for the letter which comes earlier in the alphabet, and the
argument mentioned second for the later letter. This sufficiently distinguishes
between $\phi ! (\hat{x}, \hat{y})$ and $\phi ! (\hat{y}, \hat{x})$; e. g.:



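The value of $\phi ! (\hat{x}, \hat{y})$ for arguments a, b is $\phi ! (a, b)$.
The value of $\phi ! (\hat{x}, \hat{y})$ for arguments b, a is $\phi ! (b, a)$.
The value of $\phi ! (\hat{y}, \hat{x})$ for arguments a, b is $\phi ! (b, a)$.
The value of $\phi ! (\hat{y}, \hat{x})$ for arguments b, a is $\phi ! (a, b)$.

We put

$$x \epsilon \phi ! \hat{z} . = . \phi ! x \quad \text{Df.},$$

whence

$$\vdash :. x \epsilon \hat{z}(\psi z) . \equiv : (\exists \phi) : \phi ! y . \equiv_y . \psi y : \phi ! x.$$

Also by the reducibility-axiom we have

$$(\exists \phi) : \phi ! y . \equiv_y . \psi y,$$

whence

$$\vdash : x \epsilon \hat{z}(\psi z) . \equiv . \psi x.$$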
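This holds whatever x may be. Suppose now we want to consider
$\hat{z}(\psi z) \epsilon \hat{\phi} f\{\hat{z}(\phi ! z)\}$. We have, by the above,

$$\vdash :. \hat{z}(\psi z) \epsilon \hat{\phi} f\{\hat{z}(\phi ! z)\} . \equiv : f\{\hat{z}(\psi z)\} : \equiv : (\exists \phi) : \phi ! y . \equiv_y . \psi y : f\{\phi ! \hat{z}\},$$

whence

$$\vdash :. \hat{z}(\psi z) = \hat{z}(\chi z) . \supset : \hat{z}(\psi z) \epsilon \kappa . \equiv . \hat{z}(\chi z) \epsilon \kappa,$$

where $\kappa$ is written for any expression of the form $\hat{\phi} f\{\hat{z}(\phi ! z)\}$.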

We put 

$$\text{cls} = \hat{\alpha}\{(\exists \phi) . \alpha = \hat{z}(\phi ! z)\} \quad \text{Df.}$$

Here cls has a meaning which depends upon the type of the apparent variable $\phi$.
Thus, e. g., the proposition "$\text{cls} \epsilon \text{cls}$," which is a consequence of the above
definition, requires that "cls" should have a different meaning in the two places
where it occurs. The symbol "cls" can only be used where it is unnecessary to
know the type; it has an ambiguity which adjusts itself to circumstances. If
we introduce as an indefinable the function "$\text{Indiv} ! x$," meaning "x is an
individual," we may put

$$\text{Kl} = \hat{\alpha}\{(\exists \phi) . \alpha = \hat{z}(\phi ! z . \text{Indiv} ! z)\} \quad \text{Df.}$$

Then Kl is an unambiguous symbol meaning "classes of individuals."

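We will use small Greek letters (other than $\epsilon$, $\phi$, $\psi$, $\chi$, $\theta$) to represent
classes of whatever type; i.e., to stand for symbols of the form $\hat{z}(\phi ! z)$ or $\hat{z}(\phi z)$.

The theory of classes proceeds, from this point on, much as in Peano's
system; $\hat{z}(\phi z)$ replaces $z \ni (\phi z)$. Also I put

$$\alpha \subset \beta . = : x \epsilon \alpha . \supset_x . x \epsilon \beta \quad \text{Df.}$$

$$\exists ! \alpha . = . (\exists x) . x \epsilon \alpha \quad \text{Df.}$$

$$V = \hat{x}(x = x) \quad \text{Df.}$$

$$\Lambda = \hat{x}\{{\sim}(x = x)\} \quad \text{Df.}$$

where $\Lambda$, as with Peano, is the null-class. The symbols $\exists$, $\Lambda$, V, like cls and $\epsilon$,
are ambiguous, and only acquire a definite meaning when the type concerned is
otherwise indicated.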

We treat relations in exactly the same way, putting 

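$$a\{\phi ! (\hat{x}, \hat{y})\}b . = . \phi ! (a, b) \quad \text{Df.}$$

(the order being determined by the alphabetical order of x and y and the
typographical order of a and b); whence

$$\vdash :. a\{\hat{x}\hat{y}\psi(x, y)\}b . \equiv : (\exists \phi) : \psi(x, y) . \equiv_{x, y} . \phi ! (x, y) : \phi ! (a, b),$$

whence, by the reducibility-axiom,

$$\vdash : a\{\hat{x}\hat{y}\psi(x, y)\}b . \equiv . \psi(a, b).$$

We use Latin capital letters as abbreviations for such symbols as $\hat{x}\hat{y}\psi(x, y)$,
and we find

$$\vdash :. R = S . \equiv : xRy . \equiv_{x, y} . xSy,$$

where

$$R = S . = : f ! R . \supset_f . f ! S \quad \text{Df.}$$

We put

$$\text{Rel} = \hat{R}\{(\exists \phi) . R = \hat{x}\hat{y}\phi ! (x, y)\} \quad \text{Df.},$$

and we find that everything proved for classes has its analogue for dual relations.
Following Peano, we put

$$\alpha \cap \beta = \hat{x}(x \epsilon \alpha . x \epsilon \beta) \quad \text{Df.},$$

defining the product, or common part, of two classes;

$$\alpha \cup \beta = \hat{x}(x \epsilon \alpha . \vee . x \epsilon \beta) \quad \text{Df.},$$

defining the sum of two classes; and

$$-\alpha = \hat{x}\{{\sim}(x \epsilon \alpha)\} \quad \text{Df.},$$

defining the negation of a class. Similarly for relations we put

$$R \dot{\cap} S = \hat{x}\hat{y}(xRy . xSy) \quad \text{Df.}$$

$$R \dot{\cup} S = \hat{x}\hat{y}(xRy . \vee . xSy) \quad \text{Df.}$$

$$\dot{-}R = \hat{x}\hat{y}\{{\sim}(xRy)\} \quad \text{Df.}$$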
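
Over a finite universe, the product, sum, and negation just defined reduce to ordinary set operations, with relations treated as classes of ordered couples. The following is an editorial sketch only: `UNIVERSE` stands in for V within a single type, and every function name is a hypothetical label rather than notation from the text.

```python
UNIVERSE = frozenset(range(6))          # stands in for V within one type
COUPLES = frozenset((x, y) for x in UNIVERSE for y in UNIVERSE)

def class_product(a, b):
    """Common part of two classes."""
    return frozenset(x for x in UNIVERSE if x in a and x in b)

def class_sum(a, b):
    return frozenset(x for x in UNIVERSE if x in a or x in b)

def class_negation(a):
    return frozenset(x for x in UNIVERSE if x not in a)

def contained_in(a, b) -> bool:
    """Every member of a is a member of b."""
    return all(x in b for x in a)

# The analogous operations for relations, taken as classes of couples.
def rel_product(r, s):
    return frozenset(c for c in COUPLES if c in r and c in s)

def rel_sum(r, s):
    return frozenset(c for c in COUPLES if c in r or c in s)

def rel_negation(r):
    return frozenset(c for c in COUPLES if c not in r)

ALPHA = frozenset({0, 1, 2})
BETA = frozenset({2, 3})
LESS = frozenset((x, y) for (x, y) in COUPLES if x < y)
```

Because the product was defined earlier through negation and disjunction, `class_product(ALPHA, BETA)` agrees with the negation of the sum of the two negations, which is the class-level counterpart of the definition of "p and q".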

VIII. 

Descriptive Functions. 

The functions hitherto considered have been propositional functions, with
the exception of a few particular functions such as $R \dot{\cap} S$. But the ordinary
functions of mathematics, such as $x^2$, $\sin x$, $\log x$, are not propositional.
Functions of this kind always mean "the term having such-and-such a relation
to x." For this reason they may be called descriptive functions, because they
describe a certain term by means of its relation to their argument. Thus "$\sin \pi/2$"
describes the number 1; yet propositions in which $\sin \pi/2$ occurs are not the same
as they would be if 1 were substituted. This appears, e. g., from the proposition
"$\sin \pi/2 = 1$," which conveys valuable information, whereas "1 = 1" is
trivial. Descriptive functions have no meaning by themselves, but only as
constituents of propositions; and this applies generally to phrases of the form "the
term having such-and-such a property." Hence in dealing with such phrases,
we must define any proposition in which they occur, not the phrases themselves.*
We are thus led to the following definition, in which "$(℩x)(\phi x)$" is to be read
"the term x which satisfies $\phi x$."

$$\psi\{(℩x)(\phi x)\} . = : (\exists b) : \phi x . \equiv_x . x = b : \psi b \quad \text{Df.}$$

This definition states that "the term which satisfies $\phi$ satisfies $\psi$" is to
mean: "There is a term b such that $\phi x$ is true when and only when x is b,
and $\psi b$ is true." Thus all propositions about "the so-and-so" will be false if
there are no so-and-so's or several so-and-so's.

The general definition of a descriptive function is 

$$R\text{‘}y = (℩x)(xRy) \quad \text{Df.};$$

that is, "$R\text{‘}y$" is to mean "the term which has the relation R to y." If there
are several terms or none having the relation R to y, all propositions about $R\text{‘}y$
will be false. We put

$$E ! (℩x)(\phi x) . = : (\exists b) : \phi x . \equiv_x . x = b \quad \text{Df.}$$

Here "$E ! (℩x)(\phi x)$" may be read "there is such a term as the x which satisfies
$\phi x$," or "the x which satisfies $\phi x$ exists." We have

$$\vdash :. E ! R\text{‘}y . \equiv : (\exists b) : xRy . \equiv_x . x = b.$$

The inverted comma in $R\text{‘}y$ may be read "of." Thus if R is the relation of father
to son, "$R\text{‘}y$" is "the father of y." If R is the relation of son to father, all
propositions about $R\text{‘}y$ will be false unless y has one son and no more.
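
The definitions of this section can be tried out on a finite domain, where "there is a term b such that the condition holds when and only when x is b" becomes a count of satisfiers. This is an editorial sketch: the domain, the family relations, and the helper names (`the`, `exists_the`, `holds_of_the`, `has_relation_to`) are hypothetical illustrations and do not come from the text.

```python
DOMAIN = ["abraham", "isaac", "jacob", "esau"]

def the(phi) -> list:
    """Return [b] if exactly one b in DOMAIN satisfies phi, otherwise []."""
    satisfiers = [x for x in DOMAIN if phi(x)]
    return satisfiers if len(satisfiers) == 1 else []

def exists_the(phi) -> bool:
    # E!(the x)(phi x): some b is such that phi x holds when and only when x is b.
    return len(the(phi)) == 1

def holds_of_the(psi, phi) -> bool:
    # psi{(the x)(phi x)}: false when nothing, or more than one thing, satisfies phi.
    return any(psi(b) for b in the(phi))

# R as the relation of father to son, given as (father, son) couples.
FATHER_TO_SON = {("abraham", "isaac"), ("isaac", "jacob"), ("isaac", "esau")}
# R as the relation of son to father, given as (son, father) couples.
SON_TO_FATHER = {(son, father) for (father, son) in FATHER_TO_SON}

def has_relation_to(rel, y):
    """The condition 'x has the relation rel to y', from which R'y is formed."""
    return lambda x: (x, y) in rel
```

With these data "the father of jacob" exists and is isaac, whereas "the son of isaac" does not exist because isaac has two sons, so even the trivial predicate `lambda b: True` is false of it under `holds_of_the`.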

From the above it appears that descriptive functions are obtained from 
relations. The relations now to be defined are chiefly important on account of 
the descriptive functions to which they give rise. 

$$\text{Cnv} = \hat{Q}\hat{P}\{xQy . \equiv_{x, y} . yPx\} \quad \text{Df.}$$



*See the above-mentioned article " On Denoting," where the reasons for this view are given at length. 
Here Cnv is short for "converse." It is the relation of a relation to its converse;
e. g., of greater to less, of parentage to sonship, of preceding to following, etc.
We have

$$\vdash . \text{Cnv}\text{‘}P = (℩Q)\{xQy . \equiv_{x, y} . yPx\}.$$

For a shorter notation, often more convenient, we put

$$\breve{P} = \text{Cnv}\text{‘}P \quad \text{Df.}$$

We want next a notation for the class of terms which have the relation R to y.
For this purpose, we put

$$\overrightarrow{R} = \hat{\alpha}\hat{y}\{\alpha = \hat{x}(xRy)\} \quad \text{Df.},$$

whence

$$\vdash . \overrightarrow{R}\text{‘}y = \hat{x}(xRy).$$

Similarly we put

$$\overleftarrow{R} = \hat{\beta}\hat{x}\{\beta = \hat{y}(xRy)\} \quad \text{Df.},$$

whence

$$\vdash . \overleftarrow{R}\text{‘}x = \hat{y}(xRy).$$

We want next the domain of R (i. e., the class of terms which have the 
relation R to something), the converse domain of R (i. e., the class of terms to 
which something has the relation R), and the field of R, which is the sum of the 
domain and the converse domain. For this purpose we define the relations of 
the domain, converse domain, and field, to R. The definitions are : 

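$$D = \hat{\alpha}\hat{R}\{\alpha = \hat{x}((\exists y) . xRy)\} \quad \text{Df.}$$

$$\text{ᗡ} = \hat{\beta}\hat{R}\{\beta = \hat{y}((\exists x) . xRy)\} \quad \text{Df.}$$

$$C = \hat{\gamma}\hat{R}\{\gamma = \hat{x}((\exists y) : xRy . \vee . yRx)\} \quad \text{Df.}$$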

Note that the third of these definitions is only significant when R is what we 
may call a homogeneous relation; i. e., one in which, if xRy holds, x and y are of 
the same type. For otherwise, however we may choose x and y, either xRy or 
yRx will be meaningless. This observation is important in connection with 
Burali-Forti's contradiction. 

We have, in virtue of the above definitions, 

$$\vdash . D\text{‘}R = \hat{x}\{(\exists y) . xRy\},$$

$$\vdash . \text{ᗡ}\text{‘}R = \hat{y}\{(\exists x) . xRy\},$$

$$\vdash . C\text{‘}R = \hat{x}\{(\exists y) : xRy . \vee . yRx\},$$

the last of these being significant only when R is homogeneous. "$D\text{‘}R$" is
read "the domain of R;" "$\text{ᗡ}\text{‘}R$" is read "the converse domain of R," and
"$C\text{‘}R$" is read "the field of R." The letter C is chosen as the initial of the
word "campus."
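
For a relation given as a finite set of couples, the converse, the two classes attached to a single term, and the domain, converse domain, and field can each be read off directly. The sketch is editorial; the function names are hypothetical (`referents` and `relata` are labels chosen here for the two arrow notations), and plain Python cannot enforce the requirement that the field is only significant for homogeneous relations.

```python
def converse(rel):
    """The relation holding between y and x whenever rel holds between x and y."""
    return {(y, x) for (x, y) in rel}

def referents(rel, y):
    """The class of terms which have the relation rel to y."""
    return {x for (x, y2) in rel if y2 == y}

def relata(rel, x):
    """The class of terms to which x has the relation rel."""
    return {y for (x2, y) in rel if x2 == x}

def domain(rel):
    """Terms which have the relation to something."""
    return {x for (x, _) in rel}

def converse_domain(rel):
    """Terms to which something has the relation."""
    return {y for (_, y) in rel}

def field(rel):
    # Sum of domain and converse domain; meaningful only if rel is homogeneous.
    return domain(rel) | converse_domain(rel)

LESS = {(1, 2), (1, 3), (2, 3)}
```

Taking the converse swaps the domain and the converse domain and leaves the field unchanged, which can be confirmed on `LESS`.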

We want next a notation for the relation, to a class $\alpha$ contained in the
domain of R, of the class of terms to which some member of $\alpha$ has the relation
R, and also for the relation, to a class $\beta$ contained in the converse domain of R,
of the class of terms which have the relation R to some member of $\beta$. For the
second of these we put

$$R_\epsilon = \hat{\alpha}\hat{\beta}\{\alpha = \hat{x}((\exists y) . y \epsilon \beta . xRy)\} \quad \text{Df.}$$

So that

$$\vdash . R_\epsilon\text{‘}\beta = \hat{x}\{(\exists y) . y \epsilon \beta . xRy\}.$$

Thus if R is the relation of father to son, and $\beta$ is the class of Etonians, $R_\epsilon\text{‘}\beta$
will be the class "fathers of Etonians;" if R is the relation "less than," and
$\beta$ is the class of proper fractions of the form $1 - 2^{-n}$ for integral values of
n, $R_\epsilon\text{‘}\beta$ will be the class of fractions less than some fraction of the form
$1 - 2^{-n}$; i. e., $R_\epsilon\text{‘}\beta$ will be the class of proper fractions. The other relation
mentioned above is $(\breve{R})_\epsilon$.

We put, as an alternative notation often more convenient,

$$R\text{“}\beta = R_\epsilon\text{‘}\beta \quad \text{Df.}$$

The relative product of two relations R, S is the relation which holds
between x and z whenever there is a term y such that xRy and ySz both hold.
The relative product is denoted by $R | S$. Thus

$$R | S = \hat{x}\hat{z}\{(\exists y) . xRy . ySz\} \quad \text{Df.}$$

We put also

$$R^2 = R | R \quad \text{Df.}$$
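
The two notions just introduced, the class of terms bearing a relation to some member of a given class and the relative product, have direct finite counterparts. This is an editorial sketch with hypothetical names (`image`, `relative_product`, `square`) and invented sample data.

```python
def image(rel, beta):
    """The terms which have the relation rel to some member of beta."""
    return {x for (x, y) in rel if y in beta}

def relative_product(r, s):
    """Holds between x and z when, for some y, x has r to y and y has s to z."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def square(r):
    return relative_product(r, r)

# Father-to-son couples (father, son); the names are illustrative only.
FATHER = {("abraham", "isaac"), ("isaac", "jacob"), ("isaac", "esau")}
```

With `FATHER` read as the relation of father to son, `image(FATHER, {"jacob", "esau"})` is the class of fathers of those two terms, and `square(FATHER)` is the relation of paternal grandfather to grandson.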

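The product and sum of a class of classes are often required. They are
defined as follows:

$$s\text{‘}\kappa = \hat{x}\{(\exists \alpha) . \alpha \epsilon \kappa . x \epsilon \alpha\} \quad \text{Df.}$$

$$p\text{‘}\kappa = \hat{x}\{\alpha \epsilon \kappa . \supset_\alpha . x \epsilon \alpha\} \quad \text{Df.}$$

Similarly for relations we put

$$\dot{s}\text{‘}\lambda = \hat{x}\hat{y}\{(\exists R) . R \epsilon \lambda . xRy\} \quad \text{Df.}$$

$$\dot{p}\text{‘}\lambda = \hat{x}\hat{y}\{R \epsilon \lambda . \supset_R . xRy\} \quad \text{Df.}$$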
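
A finite reading of the sum and product of a class of classes is sketched below as an editorial aid. The names are hypothetical, and the explicit `universe` argument is an assumption needed because the product is defined by an implication: when the class of classes is empty, every term of the type satisfies it vacuously.

```python
def sum_of_classes(kappa):
    """The terms belonging to at least one member of kappa."""
    return {x for alpha in kappa for x in alpha}

def product_of_classes(kappa, universe):
    """The terms of `universe` belonging to every member of kappa."""
    return {x for x in universe if all(x in alpha for alpha in kappa)}

UNIVERSE = {1, 2, 3, 4}
KAPPA = [frozenset({1, 2}), frozenset({2, 3})]
```

The same two functions serve for classes of relations if each relation is supplied as a set of couples and `universe` as the set of all couples.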
We need a notation for the class whose only member is x. Peano uses $\iota x$,
hence we shall use $\iota\text{‘}x$. Peano showed (what Frege also had emphasized) that
this class can not be identified with x. With the usual view of classes, the need
for such a distinction remains a mystery; but with the view set forth above, it
becomes obvious.

We put

$$\iota = \hat{\alpha}\hat{x}\{\alpha = \hat{y}(y = x)\} \quad \text{Df.},$$

whence

$$\vdash . \iota\text{‘}x = \hat{y}(y = x),$$

and

$$\vdash : E ! \breve{\iota}\text{‘}\alpha . \supset . \breve{\iota}\text{‘}\alpha = (℩x)(x \epsilon \alpha);$$

i. e., if $\alpha$ is a class which has only one member, then $\breve{\iota}\text{‘}\alpha$ is that one member.*
For the class of classes contained in a given class, we put

$$\text{Cl}\text{‘}\alpha = \hat{\beta}(\beta \subset \alpha) \quad \text{Df.}$$

We can now proceed to the consideration of cardinal and ordinal numbers, 
and of how they are affected by the doctrine of types. 

IX. 

Cardinal Numbers. 

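The cardinal number of a class $\alpha$ is defined as the class of all classes similar
to $\alpha$, two classes being similar when there is a one-one relation between them.
The class of one-one relations is denoted by $1 \to 1$, and defined as follows:

$$1 \to 1 = \hat{R}\{xRy . x'Ry . xRy' . \supset_{x, y, x', y'} . x = x' . y = y'\} \quad \text{Df.}$$

Similarity is denoted by Sim; its definition is

$$\text{Sim} = \hat{\alpha}\hat{\beta}\{(\exists R) . R \epsilon 1 \to 1 . D\text{‘}R = \alpha . \text{ᗡ}\text{‘}R = \beta\} \quad \text{Df.}$$

Then $\overrightarrow{\text{Sim}}\text{‘}\alpha$ is, by definition, the cardinal number of $\alpha$; this we will denote by
$\text{Nc}\text{‘}\alpha$; hence we put

$$\text{Nc} = \overrightarrow{\text{Sim}} \quad \text{Df.},$$

whence

$$\vdash . \text{Nc}\text{‘}\alpha = \overrightarrow{\text{Sim}}\text{‘}\alpha.$$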

* Thus $\breve{\iota}\text{‘}\alpha$ is what Peano calls $℩\alpha$.
The class of cardinals we will denote by NC; thus

$$NC = \text{Nc}\text{“}\text{cls} \quad \text{Df.}$$

0 is defined as the class whose only member is the null-class, $\Lambda$, so that

$$0 = \iota\text{‘}\Lambda \quad \text{Df.}$$

The definition of 1 is

$$1 = \hat{\alpha}\{(\exists c) : x \epsilon \alpha . \equiv_x . x = c\} \quad \text{Df.}$$

It is easy to prove that 0 and 1 are cardinals according to the definition.

It is to be observed, however, that 0 and 1 and all the other cardinals,
according to the above definitions, are ambiguous symbols, like cls, and have as
many meanings as there are types. To begin with: the meaning of 0 depends
upon that of $\Lambda$, and the meaning of $\Lambda$ is different according to the type of
which it is the null-class. Thus there are as many 0's as there are types; and
the same applies to all the other cardinals. Nevertheless, if two classes $\alpha$, $\beta$
are of different types, we can speak of them as having the same cardinal, or of
one as having a greater cardinal than the other, because a one-one relation may
hold between the members of $\alpha$ and the members of $\beta$, even when $\alpha$ and $\beta$ are
of different types. For example, let $\beta$ be $\iota\text{“}\alpha$; i. e., the class whose members are
the classes consisting of single members of $\alpha$. Then $\iota\text{“}\alpha$ is of higher type than $\alpha$,
but similar to $\alpha$, being correlated with $\alpha$ by the one-one relation $\iota$.
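
The example of a class and the class of its unit classes can be checked on a small case. The sketch below is editorial: the helper names (`is_one_one`, `similar_by`, `unit_class`) and the three-member class are hypothetical, and Python's `frozenset` merely imitates the step up in type.

```python
def is_one_one(rel) -> bool:
    """No two couples of rel share a first term or share a second term."""
    firsts = [x for (x, _) in rel]
    seconds = [y for (_, y) in rel]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)

def similar_by(rel, alpha, beta) -> bool:
    """rel is one-one, its domain is alpha, and its converse domain is beta."""
    return (is_one_one(rel)
            and {x for (x, _) in rel} == set(alpha)
            and {y for (_, y) in rel} == set(beta))

def unit_class(x) -> frozenset:
    return frozenset([x])

ALPHA = frozenset({"a", "b", "c"})
# The class of unit classes of members of ALPHA: one type higher than ALPHA.
UNIT_CLASSES = frozenset(unit_class(x) for x in ALPHA)
# The relation of each unit class to its sole member, restricted to ALPHA.
IOTA = {(unit_class(x), x) for x in ALPHA}
```

The members of `UNIT_CLASSES` are classes while the members of `ALPHA` are not, yet `IOTA` correlates the two one-to-one, so they have the same number of members in the sense defined.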

The hierarchy of types has important results in regard to addition.
Suppose we have a class of $\alpha$ terms and a class of $\beta$ terms, where $\alpha$ and $\beta$ are
cardinals; it may be quite impossible to add them together to get a class of
$\alpha + \beta$ terms, since, if the classes are not of the same type, their logical sum
is meaningless. Where only a finite number of classes are concerned, we can
obviate the practical consequences of this, owing to the fact that we can always
apply operations to a class which raise its type to any required extent without
altering its cardinal number. For example, given any class $\alpha$, the class $\iota\text{“}\alpha$ has
the same cardinal number, but is of the next type above $\alpha$. Hence, given any
finite number of classes of different types, we can raise all of them to the type
which is what we may call the lowest common multiple of all the types in
question; and it can be shown that this can be done in such a way that the
resulting classes shall have no common members. We may then form the logical
sum of all the classes so obtained, and its cardinal number will be the
arithmetical sum of the cardinal numbers of the original classes. But where we
have an infinite series of classes of ascending types, this method can not be
applied. For this reason, we can not now prove that there must be infinite
classes. For suppose there were only n individuals altogether in the universe,
where n is finite. There would then be $2^n$ classes of individuals, and $2^{2^n}$ classes
of classes of individuals, and so on. Thus the cardinal number of terms in each
type would be finite; and though these numbers would grow beyond any assigned
finite number, there would be no way of adding them so as to get an infinite
number. Hence we need an axiom, so it would seem, to the effect that no finite
class of individuals contains all individuals; but if any one chooses to assume
that the total number of individuals in the universe is (say) 10,367, there seems
no a priori way of refuting his opinion.
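
The growth described here is easy to tabulate for a small number of individuals. The function below is an editorial sketch; its name `type_sizes` and the choice of three individuals are assumptions made purely for illustration.

```python
def type_sizes(n_individuals: int, levels: int) -> list[int]:
    """Number of terms in each successive type, starting from the individuals.

    Each type above the individuals contains one class for every
    selection of terms from the type below, hence 2 ** (size below).
    """
    sizes = [n_individuals]
    for _ in range(levels):
        sizes.append(2 ** sizes[-1])
    return sizes

SIZES = type_sizes(3, 2)   # individuals, classes, classes of classes
print(SIZES)
```

However many levels are requested, every entry is a finite integer, which is the point of the argument: the counts outgrow any assigned bound, but no single type ever becomes infinite.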

From the above mode of reasoning, it is plain that the doctrine of types
avoids all difficulties as to the greatest cardinal. There is a greatest cardinal in
each type, namely the cardinal number of the whole of the type; but this is
always surpassed by the cardinal number of the next type, since, if $\alpha$ is the
cardinal number of one type, that of the next type is $2^\alpha$, which, as Cantor has
shown, is always greater than $\alpha$. Since there is no way of adding different
types, we can not speak of "the cardinal number of all objects, of whatever
type," and thus there is no absolutely greatest cardinal.

If it is admitted that no finite class of individuals contains all individuals,
it follows that there are classes of individuals having any finite number. Hence
all finite cardinals exist as individual-cardinals; i.e., as the cardinal numbers of
classes of individuals. It follows that there is a class of $\aleph_0$ cardinals, namely,
the class of finite cardinals. Hence $\aleph_0$ exists as the cardinal of a class of classes
of classes of individuals. By forming all classes of finite cardinals, we find
that $2^{\aleph_0}$ exists as the cardinal of a class of classes of classes of classes of
individuals; and so we can proceed indefinitely. The existence of $\aleph_n$ for every
finite value of n can also be proved; but this requires the consideration of
ordinals.

If, in addition to assuming that no finite class contains all individuals, we
assume the multiplicative axiom (i. e., the axiom that, given a set of mutually
exclusive classes, none of which are null, there is at least one class consisting of
one member from each class in the set), then we can prove that there is a class of
individuals containing $\aleph_0$ members, so that $\aleph_0$ will exist as an individual-cardinal.
This somewhat reduces the type to which we have to go in order to prove the
existence-theorem for any given cardinal, but it does not give us any
existence-theorem which can not be got otherwise sooner or later.

Many elementary theorems concerning cardinals require the multiplicative
axiom.* It is to be observed that this axiom is equivalent to Zermelo's,† and
therefore to the assumption that every class can be well-ordered.‡ These
equivalent assumptions are, apparently, all incapable of proof, though the
multiplicative axiom, at least, appears highly self-evident. In the absence of proof,
it seems best not to assume the multiplicative axiom, but to state it as a
hypothesis on every occasion on which it is used.

X. 

Ordinal Numbers. 

An ordinal number is a class of ordinally similar well-ordered series, i. e., 
of relations generating such series. Ordinal similarity or likeness is defined as 
follows : 

$$\text{Smor} = \hat{P}\hat{Q}\{(\exists S) . S \epsilon 1 \to 1 . \text{ᗡ}\text{‘}S = C\text{‘}Q . P = S | Q | \breve{S}\} \quad \text{Df.},$$

where "Smor" is short for "similar ordinally."

The class of serial relations, which we will call " Ser," is defined as 
follows : 

$$\text{Ser} = \hat{P}\{xPy . \supset_{x, y} . {\sim}(x = y) : xPy . yPz . \supset_{x, y, z} . xPz : x \epsilon C\text{‘}P . \supset_x . \overrightarrow{P}\text{‘}x \cup \iota\text{‘}x \cup \overleftarrow{P}\text{‘}x = C\text{‘}P\} \quad \text{Df.}$$

That is, reading P as "precedes," a relation is serial if (1) no term
precedes itself, (2) a predecessor of a predecessor is a predecessor, (3) if x is any
term in the field of the relation, then the predecessors of x together with x
together with the successors of x constitute the whole field of the relation.

* Cf. Part III of a paper by the present author, "On some Difficulties in the Theory of Transfinite Numbers
and Order Types," Proc. London Math. Soc. Ser. II, Vol. IV, Part I.

† Cf. loc. cit. for a statement of Zermelo's axiom, and for the proof that this axiom implies the
multiplicative axiom. The converse implication results as follows: Putting $\text{Prod}\text{‘}k$ for the multiplicative class
of k, consider

$$Z\text{‘}\beta = \hat{R}\{(\exists x) . x \epsilon \beta . D\text{‘}R = \iota\text{‘}\beta . \text{ᗡ}\text{‘}R = \iota\text{‘}x\} \quad \text{Df.},$$

and assume

$$\gamma \epsilon \text{Prod}\text{‘}Z\text{“}\text{cl}\text{‘}\alpha . R = \hat{\xi}\hat{x}\{(\exists S) . S \epsilon \gamma . \xi S x\}.$$

Then R is a Zermelo-correlation. Hence if $\text{Prod}\text{‘}Z\text{“}\text{cl}\text{‘}\alpha$ is not null, at least one Zermelo-correlation
for $\alpha$ exists.

‡ See Zermelo, "Beweis, dass jede Menge wohlgeordnet werden kann." Math. Annalen, Vol. LIX,
pp. 514-516.
Well-ordered serial relations, which we will call $\Omega$, are defined as follows:

$$\Omega = \hat{P}\{P \epsilon \text{Ser} : \alpha \subset C\text{‘}P . \exists ! \alpha . \supset_\alpha . \exists ! (\alpha - \breve{P}\text{“}\alpha)\} \quad \text{Df.};$$

i. e., P generates a well-ordered series if P is serial, and any class $\alpha$ contained in
the field of P and not null has a first term. (Note that $\breve{P}\text{“}\alpha$ are the terms
coming after some term of $\alpha$).
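
For a relation given as a finite set of couples, the three conditions for a serial relation and the further condition for a well-ordered one can be tested exhaustively. This is an editorial sketch with hypothetical function names; it enumerates every non-null subclass of the field, so it is only practical for very small fields.

```python
from itertools import combinations

def field(rel):
    return {x for (x, _) in rel} | {y for (_, y) in rel}

def is_serial(rel) -> bool:
    f = field(rel)
    # (1) no term precedes itself
    irreflexive = all(x != y for (x, y) in rel)
    # (2) a predecessor of a predecessor is a predecessor
    transitive = all((x, z) in rel
                     for (x, y) in rel for (y2, z) in rel if y == y2)
    # (3) predecessors of x, x itself, and successors of x exhaust the field
    connected = all(
        {p for (p, q) in rel if q == x} | {x} | {s for (q, s) in rel if q == x} == f
        for x in f)
    return irreflexive and transitive and connected

def has_first_term(rel, alpha) -> bool:
    """Some member of alpha comes after no member of alpha."""
    return any(all((y, x) not in rel for y in alpha) for x in alpha)

def is_well_ordered(rel) -> bool:
    terms = list(field(rel))
    subclasses = (set(c) for k in range(1, len(terms) + 1)
                  for c in combinations(terms, k))
    return is_serial(rel) and all(has_first_term(rel, a) for a in subclasses)

LESS = {(x, y) for x in range(4) for y in range(4) if x < y}
CYCLE = {(0, 1), (1, 2), (2, 0)}
```

On a finite field every serial relation passes the well-ordering test, so the second condition only does real work for infinite series; the cyclic relation `CYCLE` is included to show a relation that already fails to be serial and whose field has no first term.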

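If we denote by $\text{No}\text{‘}P$ the ordinal number of a well-ordered relation P,
and by NO the class of ordinal numbers, we shall have

$$\text{No} = \hat{\alpha}\hat{P}\{P \epsilon \Omega . \alpha = \overrightarrow{\text{Smor}}\text{‘}P\} \quad \text{Df.}$$

$$NO = \text{No}\text{“}\Omega.$$

From the definition of No we have

$$\vdash : P \epsilon \Omega . \supset . \text{No}\text{‘}P = \overrightarrow{\text{Smor}}\text{‘}P$$

$$\vdash : {\sim}(P \epsilon \Omega) . \supset . {\sim}E ! \text{No}\text{‘}P.$$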

If we now examine our definitions with a view to their connection with the
theory of types, we see, to begin with, that the definitions of "Ser" and $\Omega$
involve the fields of serial relations. Now the field is only significant when the
relation is homogeneous; hence relations which are not homogeneous do not
generate series. For example, the relation $\iota$ might be thought to generate series
of ordinal number $\omega$, such as

$$x, \ \iota\text{‘}x, \ \iota\text{‘}\iota\text{‘}x, \ \ldots, \ \iota^n\text{‘}x, \ \ldots,$$

and we might attempt to prove in this way the existence of $\omega$ and $\aleph_0$. But x
and $\iota\text{‘}x$ are of different types, and therefore there is no such series according to
the definition.

The ordinal number of a series of individuals is, by the above definition of
No, a class of relations of individuals. It is therefore of a different type from
any individual, and can not form part of any series in which individuals occur.
Again, suppose all the finite ordinals exist as individual-ordinals; i. e., as the
ordinals of series of individuals. Then the finite ordinals themselves form a
series whose ordinal number is $\omega$; thus $\omega$ exists as an ordinal-ordinal, i. e., as
the ordinal of a series of ordinals. But the type of an ordinal-ordinal is that of
classes of relations of classes of relations of individuals. Thus the existence of
$\omega$ has been proved in a higher type than that of the finite ordinals. Again, the
cardinal number of ordinal numbers of well-ordered series that can be made out
of finite ordinals is $\aleph_1$; hence $\aleph_1$ exists in the type of classes of classes of classes
of relations of classes of relations of individuals. Also the ordinal numbers of
well-ordered series composed of finite ordinals can be arranged in order of
magnitude, and the result is a well-ordered series whose ordinal number is $\omega_1$.
Hence $\omega_1$ exists as an ordinal-ordinal-ordinal. This process can be repeated any
finite number of times, and thus we can establish the existence, in appropriate
types, of $\aleph_n$ and $\omega_n$ for any finite value of n.

But the above process of generation no longer leads to any totality of all 
ordinals, because, if we take all the ordinals of any given type, there are always 
greater ordinals in higher types ; and we can not add together a set of ordinals 
of which the type rises above any finite limit. Thus all the ordinals in any 
type can be arranged by order of magnitude in a well-ordered series, which has 
an ordinal number of higher type than that of the ordinals composing the series. 
In the new type, this new ordinal is not the greatest. In fact, there is no 
greatest ordinal in any type, but in every type all ordinals are less than some 
ordinals of higher type. It is impossible to complete the series of ordinals, 
since it rises to types above every assignable finite limit ; thus although every 
segment of the series of ordinals is well-ordered, we can not say that the whole 
series is well-ordered, because the "whole series" is a fiction. Hence Burali- 
Forti's contradiction disappears. 

From the last two sections it appears that, if it is allowed that the number 
of individuals is not finite, the existence of all Cantor's cardinal and ordinal 
numbers can be proved, short of ℵ_ω and ω_ω. (It is quite possible that the 
existence of these may also be demonstrable.) The existence of all finite car- 
dinals and ordinals can be proved without assuming the existence of anything. 
For if the cardinal number of terms in any type is n, that of terms in the next 
type is 2^n. Thus if there are no individuals, there will be one class (namely, 
the null-class), two classes of classes (namely, that containing no class and that 
containing the null-class), four classes of classes of classes, and generally 2^(n-1) 
classes of the nth order. But we can not add together terms of different types, 
and thus we can not in this way prove the existence of any infinite class. 
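
The counting in the preceding paragraph can be checked by direct enumeration. 
The sketch below is an illustration only: it assumes that classes may be 
modelled as Python frozensets and that there are no individuals at all; the 
helper name `all_classes` is hypothetical. 

```python
from itertools import combinations


def all_classes(terms):
    """Return every class whose members are drawn from terms (the power set)."""
    terms = list(terms)
    return [
        frozenset(chosen)
        for size in range(len(terms) + 1)
        for chosen in combinations(terms, size)
    ]


# Start with no individuals and ascend four orders, applying at each step
# the rule that n terms of one type give 2**n classes of the next.
current_terms = []
counts = []
for _ in range(4):
    current_terms = all_classes(current_terms)
    counts.append(len(current_terms))
```

The first three counts are 1, 2 and 4, as stated above. Applied literally, the 
rule gives 16 at the fourth order, so that the closed form given in the text for 
the nth order agrees with the enumeration only as far as the third. Every order 
is in any case finite, and the counts belong to different types, which is why no 
infinite class results. 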

We can now sum up our whole discussion. After stating some of the para- 
doxes of logic, we found that all of them arise from the fact that an expression 
referring to all of some collection may itself appear to denote one of the col- 
lection; as, for example, "all propositions are either true or false" appears to 
be itself a proposition. We decided that, where this appears to occur, we are 
dealing with a false totality, and that in fact nothing whatever can significantly 
be said about all of the supposed collection. In order to give effect to this 
decision, we explained a doctrine of types of variables, proceeding upon the 
principle that any expression which refers to all of some type must, if it denotes 
anything, denote something of a higher type than that to all of which it refers. 
Where all of some type is referred to, there is an apparent variable belonging to 
that type. Thus any expression containing an apparent variable is of higher type 
than that variable. This is the fundamental principle of the doctrine of types. 
A change in the manner in which the types are constructed, should it prove 
necessary, would leave the solution of contradictions untouched so long as this 
fundamental principle is observed. The method of constructing types explained 
above was shown to enable us to state all the fundamental definitions of mathe- 
matics, and at the same time to avoid all known contradictions. And it 
appeared that in practice the doctrine of types is never relevant except where 
existence-theorems are concerned, or where applications are to be made to some 
particular case. 

The theory of types raises a number of difficult philosophical questions con- 
cerning its interpretation. Such questions are, however, essentially separable 
from the mathematical development of the theory, and, like all philosophical 
questions, introduce elements of uncertainty which do not belong to the theory 
itself. It seemed better, therefore, to state the theory without reference to 
philosophical questions, leaving these to be dealt with independently. 



